Abstract

In this paper, a deterministic equivalent of the ergodic sum rate and an algorithm for evaluating the capacity-achieving input covariance matrices for uplink large-scale multiple-input multiple-output (MIMO) antenna channels are proposed. We consider a large-scale MIMO system consisting of multiple users and one base station with several distributed antenna sets. Each link between a user and an antenna set forms a two-sided spatially correlated MIMO channel with line-of-sight (LOS) components. Our derivations are based on novel techniques from large dimensional random matrix theory (RMT) under the assumption that the numbers of antennas at the terminals approach infinity with a fixed ratio. The deterministic equivalent results (the deterministic equivalent of the ergodic sum rate and the capacity-achieving input covariance matrices) are easy to compute and shown to be accurate for realistic system dimensions. In addition, they are shown to be invariant to several types of fading distribution.



Index Terms: Deterministic equivalent, large dimensional RMT, large-scale MIMO, Stieltjes transform.

1 Introduction 

To achieve higher rates, much effort has been put into improving the spectral efficiency and data throughput of wireless communication systems. Multi-antenna technology is one key technology for wireless communication and is envisaged to be adopted ubiquitously. As the number of antennas at the base stations (BSs) and user equipments (UEs) increases, communication systems will achieve better rates and link reliability [1, 2]. However, the actual achievable spectral efficiency could be greatly compromised by interference arising from simultaneous communications in neighboring areas.
Figure 1: A vision for a possible large-scale MIMO system.



A promising solution to interference management is the large-scale multiple-input multiple-output 
(MIMO) technology, e.g., [3-7]. Figure 1 illustrates a possible scenario where the antenna array of a BS 
is composed of multiple geographically distributed low-power antenna sets, installed onto a ring of high- 
speed fibre-bus, and this BS is communicating with several multi-antenna UEs. The large-scale MIMO 
setting is beneficial not only in terms of communication performances (such as better coverage and efficient 
radio resource utilization) but also in terms of energy saving.$^1$ In this complex system model, a number of practical factors such as correlation effects and line-of-sight (LOS) components need to be included. The former occur due to the space limitation of UEs, and the latter due to the densification of the antenna arrays, which results in a visible propagation path from the UEs. For typical systems of tens of distributed antenna sets and hundreds of UEs, even computer simulations become challenging [9], which makes performance analysis of such large-scale MIMO systems an important and new subject of research.

When a system is large, exact performance analysis is no longer suitable because an exact analytical 
expression would be too complex to appreciate. Hence, alternatives have emerged and the large dimensional 
random matrix theory (RMT) [6, 10-19] provides a powerful tool in dealing with large-scale MIMO systems. 
Utilizing the large dimensional RMT, this paper aims to derive information-theoretic results of the large- 
scale MIMO systems. In particular, our focus is on the uplink large-scale MIMO systems consisting of K 
UEs and a BS with $L$ distributed sets of multiple antennas. Let $n_k$ and $N_l$ denote, respectively, the numbers of antennas at the $k$-th UE and the $l$-th antenna set of the BS receiver. The channel between the $k$-th UE and the $l$-th antenna set is modeled as the $N_l \times n_k$ complex matrix $\mathbf{H}_{l,k} = \mathbf{R}_{l,k}^{1/2} \mathbf{X}_{l,k} \mathbf{T}_{l,k}^{1/2} + \bar{\mathbf{H}}_{l,k}$, where $\mathbf{X}_{l,k}$'s are statistically independent random matrices of independent and identically distributed (i.i.d.) entries (but not necessarily Gaussian$^2$), $\bar{\mathbf{H}}_{l,k}$ is a deterministic matrix reflecting the LOS components of the

1 With this setting, the number of BSs can be greatly reduced. Note that the air conditioning of each BS consumes up to 20,000 kWh per year on average, which is sometimes more than the other equipment in a BS consumes [8].
2 Despite the Rayleigh or Rician distribution being the most popular distributions for small-scale amplitude fading, there 
channel, and $\mathbf{R}_{l,k}$ and $\mathbf{T}_{l,k}$, respectively, characterize the spatial correlation structures at the receiver and transmitter sides separately. Since the signals from multiple antenna sets are collected at a BS, the corresponding channel matrix of UE $k$ can be expressed as $\mathbf{H}_k = [\mathbf{H}_{1,k}^T \cdots \mathbf{H}_{L,k}^T]^T$. An important objective of this study is to obtain a deterministic equivalent of the ergodic sum rate for the distributed uplink MIMO channel $\sum_{k=1}^{K} \mathbf{H}_k \mathbf{H}_k^H$ so that the system sum rate can be efficiently and accurately computed.

Although there have been many results on MIMO capacity analysis utilizing large dimensional RMT [10, 12-16], the general model studied in this paper has not been addressed. To appreciate the objective of this paper, it is important to understand the limitations of the existing results. First, previous works on large-scale MIMO systems usually assumed $n_k = 1$ and $N_l = 1$ for all $k, l$. That is, the UEs have only one antenna each and the BS is equipped with completely distributed antennas (i.e., one antenna in each antenna set). The elements of this channel matrix merely reflect the path loss differences between the links. Regarding the channel model (the channel with a variance profile), the most relevant work is [12] (or [11, Theorem 3.8] without the LOS components). In [12], a deterministic equivalent of the mutual information$^3$ was derived based on the Bai-and-Silverstein method [22] (or [17, Chapter 6.2.1]). In fact, the results of [12] can be easily extended to the case with $n_k > 1$ and $N_l > 1$, but the spatial correlation matrices $\mathbf{T}_{l,k}$'s and $\mathbf{R}_{l,k}$'s are then required to be diagonal.

The deterministic approximations of [11, 12] have found many applications in various system optimization designs such as scheduling [9, 23], training length designs [24], cell planning [25], and many others [11, 17]. This is because designs based on the deterministic approximations not only provide an efficient computation method but also give insight into what the optimal strategies look like. However, inheriting the limitations of [11, 12], these results do not allow the UEs or each antenna set of the BS to be equipped with multiple spatially correlated antennas. Because of the potential applicability of deterministic equivalent results to system designs, there is a strong desire to derive new deterministic equivalents, like those given in [12], for the general model of our interest. However, even for an extension to the one-sided spatially correlated case, there are several obstacles when one intends to obtain the deterministic equivalent of the mutual information by using the Bai-and-Silverstein method and the like.$^4$

To date, there are only very few results dealing with random matrix models where the entries are 
correlated across both rows and columns. Most studies only considered random matrices with independent 

are other classes of fading distributions which serve as better models under certain circumstances [20, 21]. 

3 Formally, it should be read as the mutual information between the input and output over the channel with a variance 
profile. In this paper, we often simply refer to it as "the mutual information" if no confusion would occur. 

4 If $\bar{\mathbf{H}}_{l,k} = \mathbf{0}$ $\forall l, k$, a partial generalization is possible by the Bai-and-Silverstein method. Specifically, with minor modifications for the case in [16, 26], the asymptotic mutual information can be obtained for the case where $\mathbf{R}_{l,k}$'s are permitted to be nonnegative definite, while $\bar{\mathbf{H}}_{l,k} = \mathbf{0}$ and $\mathbf{T}_{l,k}$'s are diagonal. If $\mathbf{T}_{l,k}$'s are generally nonnegative definite, difficulties arise.
complex Gaussian random variables and used the fact that the correlated Gaussian random matrix can be transformed to an uncorrelated one with non-identically distributed entries without changing the concerned objects (e.g., the eigenvalue distribution and the mutual information). For convenience, we will refer to this transformation as the decorrelation procedure. Because of the assumption of Gaussianity, the entries are in fact uncorrelated, and so the Bai-and-Silverstein method can be used. For the latest results using this trick, refer to, e.g., [16]. Unfortunately, the channel model of our interest (i.e., $\mathbf{H}_k$) cannot be transformed to a Gaussian random matrix with uncorrelated columns even if $\mathbf{X}_{l,k}$'s are assumed to be Gaussian. For this to be possible, it would require that $\mathbf{T}_{1,k}, \ldots, \mathbf{T}_{L,k}$ be simultaneously unitarily diagonalizable for every $k$. Clearly, this restriction in the model does not permit UEs to have multiple spatially correlated antennas, which is unrealistic and greatly limits the significance of the model.

If the entries of the random matrices are Gaussian, then an alternative method, known as the Gaussian method [27] (the integration-by-parts formula and the Poincaré-Nash inequality), is much more useful. In
this context, Hachem et al. [13, 15] have succeeded in obtaining the deterministic equivalent of mutual 
information for Kronecker (or separately) correlated Rayleigh and Rician MIMO channels. Compared to 
the Bai-and-Silverstein method, the Gaussian method is only suited to random matrices with Gaussian 
entries. However, one may extend the results obtained for matrices with Gaussian entries to any random 
matrices with independent entries following two recent developments, the Lindeberg principle [28] and the 
interpolation trick [29]. For the latest results, see, e.g., [19], where the Lindeberg principle is applied. 

Early analyses using the Gaussian method were only for the typical Kronecker MIMO channel [13, 15].$^5$ In that case, the correlated Gaussian random matrix was transformed into an uncorrelated one, and the decorrelation procedure was employed. As such, the Gaussian method was merely an alternative tool to study large dimensional random matrices. Its superiority in dealing with random matrices with correlated patterns remained largely unexplored until recently, when Dupuy and Loubaton [18] derived the deterministic equivalent of the average mutual information for a frequency selective MIMO channel, in which the decorrelation procedure could not be applied. We believe that the Gaussian method can be useful to treat other random matrices with involved correlation. With the aid of the Lindeberg principle, one may further extend the results obtained for matrices with Gaussian entries to any random matrices. Following this approach, this paper combines the two techniques to obtain the deterministic equivalents for the concerned channel model.

In particular, we first use the Gaussian method to derive the deterministic equivalent of the ergodic sum rate for the large-scale MIMO multiple access channel (MAC) when $\mathbf{X}_{l,k}$'s are Gaussian distributed. Our

5 In this paper, the typical Kronecker MIMO channel means that K = L = 1. 
results are much more general and can cope with several complex applications. As a special case, this contribution complements the results of [18] by extending the analysis to the case with LOS components. This extension is non-trivial.$^6$ Next, by the generalized Lindeberg principle [28, 30], we generalize the deterministic equivalent for random matrices with Gaussian entries to those with non-Gaussian entries. Simulation results reveal that even for systems with realistic system dimensions, the deterministic approximation of the ergodic sum rate provides reliable estimates of those obtained by Monte-Carlo simulations. Then, we apply the approximation to design the input covariances that tend to maximize the ergodic sum rate of the large-scale MIMO MAC, and provide an iterative water-filling optimization algorithm when only the statistical CSI at the transmitter (precisely, $\mathbf{T}_{l,k}$'s, $\mathbf{R}_{l,k}$'s, and $\bar{\mathbf{H}}_{l,k}$'s) is available. Finally, we conduct several simulations to confirm that the results of our approach are comparable with those of the true (but time-consuming) optimization procedure under several types of fading distribution.

Notations: We use uppercase and lowercase boldface letters to denote matrices and vectors, respectively. $\mathbf{I}_N$ denotes an $N \times N$ identity matrix while an all-zero matrix is denoted by $\mathbf{0}$, and an all-one matrix is denoted by $\mathbf{1}$. The matrix inequality $\succeq$ indicates positive semi-definiteness. The superscripts $(\cdot)^H$, $(\cdot)^T$, and $(\cdot)^*$ represent the conjugate-transpose, transpose, and conjugate operations, respectively. Also, we use $\mathsf{E}\{\cdot\}$ to denote expectation with respect to all random variables within the brackets; $\log(\cdot)$ is the natural logarithm; $\rho(\cdot)$ denotes the spectral radius (i.e., the largest absolute value of the eigenvalues) of a matrix. $\|\cdot\|$ represents the Euclidean norm of an input vector or the spectral norm of an input matrix, while $\|\cdot\|_F$ denotes the Frobenius norm of a matrix, and $|||\cdot|||$ represents the maximum row sum matrix norm. The complex number field is denoted by $\mathbb{C}$. For any matrix $\mathbf{A} \in \mathbb{C}^{N \times n}$, we use $[\mathbf{A}]_{lk}$, $[\mathbf{A}]_{l,k}$ or $A_{lk}$ to denote the $(l,k)$-th entry, and $a_k$ denotes the $k$-th entry of the column vector $\mathbf{a}$. The operators $(\cdot)^{1/2}$, $(\cdot)^{-1}$, $\mathrm{tr}(\cdot)$ and $\det(\cdot)$ represent the matrix principal square root, inverse, trace and determinant, respectively. In addition, $\mathrm{diag}(\mathbf{x})$ denotes a diagonal matrix with an input vector $\mathbf{x}$ representing its diagonal elements.

2 Channel Model and Problem Statement 
2.1 Uplink Large MIMO 

As shown in Figure 1, we consider the large-scale MIMO MAC with $K$ UEs, labeled as UE$_1$, ..., UE$_K$, which are equipped with $n_1, \ldots, n_K$ antennas, respectively. The $K$ UEs transmit simultaneously to a central coordinator with $L$ distributed antenna sets, labeled as BS$_1$, ..., BS$_L$, which are equipped with

6 Using the Gaussian tools, the asymptotic mutual information expressions for Rayleigh fading Kronecker MIMO channels were first proved in [13]. Two years later, authors from the same group generalized the results to Rician fading channels [15]. This in some ways reflects the difficulty of such an extension even for the typical Kronecker MIMO channel.
$N_1, \ldots, N_L$ antennas, respectively. In this paper, we use the Kronecker model to characterize the spatial correlation of the MIMO channel for each MIMO link so that the correlation at an antenna set and a UE is modeled separately, as in [31]. Specifically, the channel from UE$_k$ to BS$_l$, $\mathbf{H}_{l,k} \in \mathbb{C}^{N_l \times n_k}$, can be written as

$$\mathbf{H}_{l,k} = \tilde{\mathbf{H}}_{l,k} + \bar{\mathbf{H}}_{l,k} = \mathbf{R}_{l,k}^{1/2} \mathbf{X}_{l,k} \mathbf{T}_{l,k}^{1/2} + \bar{\mathbf{H}}_{l,k}, \tag{1}$$

where $\mathbf{R}_{l,k} \in \mathbb{C}^{N_l \times N_l}$ and $\mathbf{T}_{l,k} \in \mathbb{C}^{n_k \times n_k}$ are deterministic nonnegative definite matrices, characterizing the spatial correlation of the received signals across the antenna elements of BS$_l$ and that of the transmitted signals across the antenna elements of UE$_k$, respectively; $\mathbf{X}_{l,k} \equiv \big[\tfrac{1}{\sqrt{n_k}} X_{ij}^{(l,k)}\big] \in \mathbb{C}^{N_l \times n_k}$ consists of the random components of the channel in which the elements $X_{ij}^{(l,k)}$ are i.i.d. complex random variables with zero mean and unit variance; and $\bar{\mathbf{H}}_{l,k} \in \mathbb{C}^{N_l \times n_k}$ is a deterministic matrix corresponding to the channel LOS. With the channel given above, we define the Rician factor between UE$_k$ and BS$_l$ as

$$\kappa_{l,k} = \frac{\|\bar{\mathbf{H}}_{l,k}\|_F^2}{\mathsf{E}\{\|\tilde{\mathbf{H}}_{l,k}\|_F^2\}}. \tag{2}$$

We also denote the distance-dependent pathloss of the $(l,k)$-th pair by $g_{l,k} = \mathsf{E}\{\|\mathbf{H}_{l,k}\|_F^2\}/N_l$, where

$$\mathsf{E}\{\|\mathbf{H}_{l,k}\|_F^2\} = \frac{1}{n_k}\mathrm{tr}(\mathbf{R}_{l,k})\,\mathrm{tr}(\mathbf{T}_{l,k}) + \mathrm{tr}(\bar{\mathbf{H}}_{l,k}\bar{\mathbf{H}}_{l,k}^H). \tag{3}$$

Following the standard conventions [14], $\mathbf{R}_{l,k}$, $\mathbf{T}_{l,k}$, and $\bar{\mathbf{H}}_{l,k}$ are normalized such that

$$\mathrm{tr}(\mathbf{R}_{l,k}) = \frac{1}{\kappa_{l,k}+1}\, g_{l,k} N_l, \qquad \mathrm{tr}(\mathbf{T}_{l,k}) = n_k, \qquad \mathrm{tr}(\bar{\mathbf{H}}_{l,k}\bar{\mathbf{H}}_{l,k}^H) = \frac{\kappa_{l,k}}{\kappa_{l,k}+1}\, g_{l,k} N_l. \tag{4}$$

It is noted that $\kappa_{l,k}$ and $g_{l,k}$ are independent of the matrix dimensions. Therefore, the normalization is valid for all possible correlation patterns and imposes no restriction on practical applications. Although for convenience we simply set the same noise level (i.e., $\sigma^2$) at all the receivers, this imposes no restriction since one can adjust $g_{l,k}$ to get an arbitrary signal-to-noise ratio (SNR) for the $(l,k)$-th pair. In addition, the setting implies that the LOS components of some link pairs are allowed to be absent.
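To make the normalization concrete, the following is a minimal NumPy sketch of how one link of the model (1) could be generated so that the trace constraints in (4) hold. It is an illustration under stated assumptions, not the authors' simulation code: the exponential correlation model, the rank-one LOS matrix built from uniform-linear-array steering vectors, the angles, and all function names are hypothetical choices made only for this example.

```python
import numpy as np

def psd_sqrt(a):
    """Principal square root of a Hermitian nonnegative definite matrix."""
    w, v = np.linalg.eigh(a)
    return (v * np.sqrt(np.clip(w, 0.0, None))) @ v.conj().T

def exp_corr(size, rho):
    """Hypothetical exponential correlation model: [C]_{ij} = rho^{|i-j|}."""
    idx = np.arange(size)
    return rho ** np.abs(idx[:, None] - idx[None, :])

def normalized_link(N_l, n_k, kappa, g, rho_r=0.5, rho_t=0.3):
    """Return (R, T, H_bar) scaled so that
    tr(R) = g*N_l/(kappa+1), tr(T) = n_k, tr(H_bar H_bar^H) = kappa*g*N_l/(kappa+1)."""
    R = exp_corr(N_l, rho_r)
    R = R * (g * N_l / (kappa + 1.0)) / np.trace(R)
    T = exp_corr(n_k, rho_t)
    T = T * n_k / np.trace(T)
    # Assumed rank-one LOS component from two steering vectors (illustrative angles).
    a_r = np.exp(1j * np.pi * np.arange(N_l) * np.sin(0.3))
    a_t = np.exp(1j * np.pi * np.arange(n_k) * np.sin(-0.2))
    H_bar = np.outer(a_r, a_t.conj())
    target = kappa * g * N_l / (kappa + 1.0)
    H_bar = H_bar * np.sqrt(target / np.linalg.norm(H_bar, "fro") ** 2)
    return R, T, H_bar

def draw_channel(R, T, H_bar, rng):
    """One realization of H = R^{1/2} X T^{1/2} + H_bar with Gaussian X."""
    N_l, n_k = H_bar.shape
    # Entries of X have zero mean and variance 1/n_k, as in the model.
    X = (rng.standard_normal((N_l, n_k))
         + 1j * rng.standard_normal((N_l, n_k))) / np.sqrt(2 * n_k)
    return psd_sqrt(R) @ X @ psd_sqrt(T) + H_bar
```

With these scalings, the ratio of the LOS power to the expected scattered power equals the chosen Rician factor, and setting `kappa=0` yields a link without a LOS component.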
2.2 Problem Formulation

The sum rate has been a key metric for performance analysis of a MAC. We begin with the sum rate 
formulation of the large-scale MIMO system and then explain its relation to RMT. For ease of exposition, 



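we define $N = \sum_{l=1}^{L} N_l$, $n = \sum_{k=1}^{K} n_k$, $\mathbf{H}_k = [\mathbf{H}_{1,k}^T \cdots \mathbf{H}_{L,k}^T]^T \in \mathbb{C}^{N \times n_k}$, $\bar{\mathbf{H}}_k = [\bar{\mathbf{H}}_{1,k}^T \cdots \bar{\mathbf{H}}_{L,k}^T]^T \in \mathbb{C}^{N \times n_k}$, $\mathbf{H} = [\mathbf{H}_1 \cdots \mathbf{H}_K] \in \mathbb{C}^{N \times n}$, and $\bar{\mathbf{H}} \triangleq [\bar{\mathbf{H}}_1 \cdots \bar{\mathbf{H}}_K] \in \mathbb{C}^{N \times n}$. The channel $\mathbf{H}_k$ represents the joint channel between UE$_k$ and the $L$ distributed antenna sets interconnected at the BS. Then, the ergodic sum rate of the MIMO MAC can be expressed as [32]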



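$$V_{\mathbf{B}_N}(\sigma^2) = \frac{1}{N}\mathsf{E}\left\{\log\det\left(\mathbf{I}_N + \frac{1}{\sigma^2}\mathbf{B}_N\right)\right\}, \tag{5}$$

where $\sigma^2$ is the noise variance at the receivers and

$$\mathbf{B}_N = \sum_{k=1}^{K} \mathbf{H}_k \mathbf{H}_k^H \in \mathbb{C}^{N \times N}. \tag{6}$$

Specifically, $V_{\mathbf{B}_N}(\sigma^2)$ provides a performance metric regarding the total number of nats (or bits if the logarithm is taken to base 2) per antenna that can be transmitted reliably over the channel matrices $\{\mathbf{H}_k\}_{k=1,\ldots,K}$. The derivative of $V_{\mathbf{B}_N}(\sigma^2)$ with respect to $\sigma^2$ is given by

$$\frac{\partial V_{\mathbf{B}_N}(\sigma^2)}{\partial \sigma^2} = \frac{1}{\sigma^2}\left(\frac{1}{N}\mathsf{E}\left\{\mathrm{tr}\left(\mathbf{I}_N + \frac{1}{\sigma^2}\mathbf{B}_N\right)^{-1}\right\} - 1\right). \tag{7}$$

By Fubini's theorem, we have [12, page 891]

$$V_{\mathbf{B}_N}(\sigma^2) = \int_{\sigma^2}^{\infty} \left(\frac{1}{\omega} - \mathsf{E}\{m_{\mathbf{B}_N}(\omega)\}\right) d\omega, \tag{8}$$

where

$$m_{\mathbf{B}_N}(\omega) = \frac{1}{N}\mathrm{tr}\left(\mathbf{B}_N + \omega \mathbf{I}_N\right)^{-1}. \tag{9}$$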



In RMT, $m_{\mathbf{B}_N}(\omega)$ is referred to as the Stieltjes transform of $\mathbf{B}_N$ at point $-\omega$, which provides a convenient tool to study the behavior of large dimensional random matrices. The relationship by which the mutual information is expressed as a functional of the Stieltjes transform is called the Shannon transform [11, Section 2.2.3].
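The link between the Stieltjes transform and the rate can be checked numerically for a single fixed (non-random) matrix. The sketch below is only an illustration, assuming the identity $\frac{1}{N}\log\det(\mathbf{I}_N + \mathbf{B}/\sigma^2) = \int_{\sigma^2}^{\infty} (1/\omega - m_{\mathbf{B}}(\omega))\, d\omega$ with $m_{\mathbf{B}}(\omega) = \frac{1}{N}\mathrm{tr}(\mathbf{B} + \omega\mathbf{I}_N)^{-1}$. The substitution $\omega = \sigma^2/t$ and the Gauss-Legendre quadrature are choices made for this example, and the function names are hypothetical.

```python
import numpy as np

def stieltjes(eigs, omega):
    """m_B(omega) = (1/N) tr (B + omega I)^{-1}, computed from the eigenvalues of B."""
    return np.mean(1.0 / (eigs + omega))

def rate_via_stieltjes(eigs, sigma2, order=64):
    """Integrate 1/omega - m_B(omega) over [sigma2, inf).

    The half-line is mapped to (0, 1] by omega = sigma2 / t, so that
    d(omega) = sigma2 / t^2 dt, and Gauss-Legendre nodes are used on (0, 1).
    """
    x, w = np.polynomial.legendre.leggauss(order)
    t = 0.5 * (x + 1.0)   # nodes mapped from (-1, 1) to (0, 1)
    w = 0.5 * w           # weights rescaled accordingly
    omega = sigma2 / t
    integrand = np.array([1.0 / om - stieltjes(eigs, om) for om in omega])
    return np.sum(w * integrand * sigma2 / t ** 2)

def rate_direct(B, sigma2):
    """(1/N) log det(I + B / sigma2), in nats per receive antenna."""
    N = B.shape[0]
    _, logdet = np.linalg.slogdet(np.eye(N) + B / sigma2)
    return logdet / N
```

For a Hermitian nonnegative definite `B`, calling `rate_via_stieltjes(np.linalg.eigvalsh(B), sigma2)` should agree with `rate_direct(B, sigma2)` up to quadrature error.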

In this paper, we are interested in understanding the ergodic sum capacity of the MIMO MAC by using large dimensional RMT. In particular, we consider that $L$, $K$ are fixed but $N_1, \ldots, N_L, n_1, \ldots, n_K$ all go to infinity with ratios $\{\beta_{l,k}(N) = N_l/n_k\}$ such that

$$0 < \min_{l,k} \liminf_{N} \beta_{l,k}(N) \le \max_{l,k} \limsup_{N} \beta_{l,k}(N) < \infty. \tag{10}$$

For convenience, we refer to this large dimensional regime simply as $N \to \infty$ in the sequel. To this end, in the next section, we first find a deterministic matrix-valued function $\boldsymbol{\Psi}(\omega) \in \mathbb{C}^{N \times N}$ (to be specified later) such that

$$\mathsf{E}\{m_{\mathbf{B}_N}(\omega)\} - \frac{1}{N}\mathrm{tr}(\boldsymbol{\Psi}(\omega)) \to 0 \quad \text{for } \omega \in \mathbb{R}^+. \tag{11}$$

Following [12] (or [17, Definition 6.1]), we refer to $\frac{1}{N}\mathrm{tr}(\boldsymbol{\Psi}(\omega))$ as the deterministic equivalent of $\mathsf{E}\{m_{\mathbf{B}_N}(\omega)\}$. To appreciate the contributions of this paper, it is worth emphasizing that $\mathbf{H}_k$, in general, cannot be written in the form (1) using the separable correlation model, because different antenna sets have different spatial correlations. This is the main obstacle for this class of random matrices; otherwise, some existing results [6, 10-16, 18, 19] would apply. Next, using the Shannon transform (8), we will find $V_N(\sigma^2)$ so that $\mathsf{E}\{V_{\mathbf{B}_N}(\sigma^2)\} - V_N(\sigma^2) \to 0$ as $N \to \infty$. Finally, we will use $V_N(\sigma^2)$ to obtain the optimal input covariance matrices that maximize the deterministic approximation of the ergodic sum rate.

3 Deterministic Equivalents and Ergodic Capacity 
3.1 Deterministic Equivalents 

We first state the assumptions imposed in our system model. 

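Assumption 1 Let $\mathbf{X}_{l,k} = \big[\tfrac{1}{\sqrt{n_k}} X_{ij}^{(l,k)}\big] \in \mathbb{C}^{N_l \times n_k}$, where $X_{ij}^{(l,k)}$'s are i.i.d. complex random variables with independent real and imaginary parts such that

$$\mathsf{E}\{X_{ij}^{(l,k)}\} = 0, \quad \text{and} \quad \mathsf{E}\{|X_{ij}^{(l,k)}|^2\} = 1, \tag{12}$$

and have finite 6-th order moment.

Assumption 2 The family of deterministic matrices $\{\mathbf{T}_{l,k}, \mathbf{R}_{l,k}\}_{\forall l,k}$ is nonnegative definite. In addition, the spectral norms of $\mathbf{R}_{l,k}$, $\mathbf{T}_{l,k}$, and $\bar{\mathbf{H}}_{l,k}\bar{\mathbf{H}}_{l,k}^H$ are bounded by a constant, i.e.,

$$\max_{k,l} \max\{\|\mathbf{R}_{l,k}\|, \|\mathbf{T}_{l,k}\|, \|\bar{\mathbf{H}}_{l,k}\bar{\mathbf{H}}_{l,k}^H\|\} < C_{\max}. \tag{13}$$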

To facilitate our expressions, we define the notation $\langle \mathbf{A} \rangle_k$ that returns the submatrix of $\mathbf{A}$ obtained by extracting the elements of the rows and columns with indices from $\sum_{i=1}^{k-1} n_i + 1$ to $\sum_{i=1}^{k} n_i$. Similarly, the notation $\langle\langle \mathbf{A} \rangle\rangle_l$ returns the submatrix of $\mathbf{A}$ obtained by extracting the elements of the rows and columns with indices from $\sum_{j=1}^{l-1} N_j + 1$ to $\sum_{j=1}^{l} N_j$. Also, for convenience, in the paper, we often omit $\omega$ when writing $m_{\mathbf{B}_N}$, $\boldsymbol{\Phi}_l$, $e_{l,k}$, $\tilde{e}_{l,k}$, and denote $\sum_{l,k} = \sum_{l=1}^{L}\sum_{k=1}^{K}$.

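Theorem 1 Let $\beta_{l,k} = N_l/n_k$. Under Assumption 2, the deterministic system of the $L \times K$ equations

$$e_{l,k} = \frac{1}{N_l}\mathrm{tr}\left(\mathbf{R}_{l,k} \langle\langle \boldsymbol{\Psi} \rangle\rangle_l\right), \tag{14a}$$

$$\tilde{e}_{l,k} = \frac{1}{n_k}\mathrm{tr}\left(\mathbf{T}_{l,k} \langle \tilde{\boldsymbol{\Psi}} \rangle_k\right), \tag{14b}$$

for $1 \le l \le L$ and $1 \le k \le K$, where

$$\boldsymbol{\Psi} = \left(\boldsymbol{\Phi}^{-1} + \omega \bar{\mathbf{H}} \tilde{\boldsymbol{\Phi}} \bar{\mathbf{H}}^H\right)^{-1}, \tag{15a}$$

$$\tilde{\boldsymbol{\Psi}} = \left(\tilde{\boldsymbol{\Phi}}^{-1} + \omega \bar{\mathbf{H}}^H \boldsymbol{\Phi} \bar{\mathbf{H}}\right)^{-1}, \tag{15b}$$

$$\boldsymbol{\Phi} = \mathrm{diag}(\boldsymbol{\Phi}_1, \ldots, \boldsymbol{\Phi}_L), \tag{15c}$$

$$\tilde{\boldsymbol{\Phi}} = \mathrm{diag}(\tilde{\boldsymbol{\Phi}}_1, \ldots, \tilde{\boldsymbol{\Phi}}_K), \tag{15d}$$

$$\boldsymbol{\Phi}_l = \frac{1}{\omega}\left(\mathbf{I}_{N_l} + \sum_{k=1}^{K} \tilde{e}_{l,k} \mathbf{R}_{l,k}\right)^{-1}, \tag{15e}$$

$$\tilde{\boldsymbol{\Phi}}_k = \frac{1}{\omega}\left(\mathbf{I}_{n_k} + \sum_{l=1}^{L} \beta_{l,k} e_{l,k} \mathbf{T}_{l,k}\right)^{-1}, \tag{15f}$$

has a unique solution for $\omega \in \mathbb{R}^+$.

Under Assumptions 1 and 2, as $N \to \infty$, we then have

$$\mathsf{E}\{m_{\mathbf{B}_N}\} - \frac{1}{N}\mathrm{tr}(\boldsymbol{\Psi}) = O\left(\frac{1}{\sqrt{N}}\right), \quad \text{for } \omega \in \mathbb{R}^+. \tag{16}$$

Furthermore, if $X_{ij}^{(l,k)}$'s are Gaussian, we have

$$\mathsf{E}\{m_{\mathbf{B}_N}\} - \frac{1}{N}\mathrm{tr}(\boldsymbol{\Psi}) = O\left(\frac{1}{N^2}\right), \quad \text{for } \omega \in \mathbb{R}^+. \tag{17}$$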

Proof: Here, for ease of understanding, we give an outline of the proof. Our strategy is to show that the deterministic equivalent of $\mathsf{E}\{m_{\mathbf{B}_N}\}$ [i.e., $\frac{1}{N}\mathrm{tr}(\boldsymbol{\Psi})$] can be found for Gaussian random matrices, and then to prove that the result also applies to non-Gaussian distributions.

Let $\mathcal{B}_N$ be an $N \times N$ matrix obtained from $\mathbf{B}_N$ in (6) with all $\mathbf{X}_{l,k}$'s replaced by $\mathcal{X}_{l,k}$'s, where $\mathcal{X}_{l,k}$'s are matrices with entries being independent standard Gaussian. Using the Gaussian method [27] (the integration-by-parts formula and the Poincaré-Nash inequality), we can show that the error term $\mathsf{E}\{m_{\mathcal{B}_N}\} - \frac{1}{N}\mathrm{tr}(\boldsymbol{\Psi})$ is of order $O\left(\frac{1}{N^2}\right)$. The detailed derivation is given in Appendix A.

Next, applying the Lindeberg principle [30, Theorem 2], we prove that $\mathsf{E}\{m_{\mathbf{B}_N}\} - \mathsf{E}\{m_{\mathcal{B}_N}\} = O\left(\frac{1}{\sqrt{N}}\right)$. The detailed derivation using the Lindeberg principle is provided in Appendix B. Together with the result for the Gaussian case, the proof of (16) can be accomplished by noting that

$$\mathsf{E}\{m_{\mathbf{B}_N}\} - \frac{1}{N}\mathrm{tr}(\boldsymbol{\Psi}) = \left(\mathsf{E}\{m_{\mathbf{B}_N}\} - \mathsf{E}\{m_{\mathcal{B}_N}\}\right) + \left(\mathsf{E}\{m_{\mathcal{B}_N}\} - \frac{1}{N}\mathrm{tr}(\boldsymbol{\Psi})\right).$$

Finally, we consider the existence and uniqueness of the solution to (14) in Appendix C. □
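As an illustration of how the system (14)-(15) could be evaluated numerically, the following is a minimal NumPy sketch, not the authors' implementation. It assumes the fixed-point equations in the form written in the docstring (a reading of (14)-(15) that is consistent with the Marchenko-Pastur law in the uncorrelated single-link case), and it uses plain fixed-point iteration started from zero, which is an assumption of this sketch: Theorem 1 asserts existence and uniqueness of the solution, not convergence of this particular iteration. All function and variable names are hypothetical.

```python
import numpy as np

def deterministic_equivalent(R, T, Hbar, omega, tol=1e-12, max_iter=5000):
    """Iterate e_{l,k} and e~_{l,k} to a fixed point; return (m, e, e_tilde).

    R[l][k]: N_l x N_l, T[l][k]: n_k x n_k, Hbar[l][k]: N_l x n_k (LOS part).
    Assumed form of the equations, with beta_{l,k} = N_l / n_k:
      Phi_l  = (1/omega) (I + sum_k e~_{l,k} R_{l,k})^{-1}
      Phi~_k = (1/omega) (I + sum_l beta_{l,k} e_{l,k} T_{l,k})^{-1}
      Psi    = (Phi^{-1}  + omega Hbar Phi~ Hbar^H)^{-1}
      Psi~   = (Phi~^{-1} + omega Hbar^H Phi Hbar)^{-1}
      e_{l,k}  = tr(R_{l,k} <<Psi>>_l) / N_l,  e~_{l,k} = tr(T_{l,k} <Psi~>_k) / n_k
    The returned m = tr(Psi) / N approximates E{m_B(omega)}.
    """
    L, K = len(R), len(R[0])
    Ns = [R[l][0].shape[0] for l in range(L)]
    ns = [T[0][k].shape[0] for k in range(K)]
    ro = np.concatenate(([0], np.cumsum(Ns)))  # offsets of the receive blocks
    co = np.concatenate(([0], np.cumsum(ns)))  # offsets of the transmit blocks
    N, n = int(ro[-1]), int(co[-1])
    Hb = np.block([[Hbar[l][k] for k in range(K)] for l in range(L)])
    e = np.zeros((L, K))
    et = np.zeros((L, K))
    for _ in range(max_iter):
        Phi_inv = np.zeros((N, N), dtype=complex)
        Phit_inv = np.zeros((n, n), dtype=complex)
        for l in range(L):
            blk = np.eye(Ns[l]) + sum(et[l, k] * R[l][k] for k in range(K))
            Phi_inv[ro[l]:ro[l + 1], ro[l]:ro[l + 1]] = omega * blk
        for k in range(K):
            blk = np.eye(ns[k]) + sum((Ns[l] / ns[k]) * e[l, k] * T[l][k]
                                      for l in range(L))
            Phit_inv[co[k]:co[k + 1], co[k]:co[k + 1]] = omega * blk
        Phi, Phit = np.linalg.inv(Phi_inv), np.linalg.inv(Phit_inv)
        Psi = np.linalg.inv(Phi_inv + omega * Hb @ Phit @ Hb.conj().T)
        Psit = np.linalg.inv(Phit_inv + omega * Hb.conj().T @ Phi @ Hb)
        e_new = np.array([[np.trace(R[l][k] @ Psi[ro[l]:ro[l + 1], ro[l]:ro[l + 1]]).real
                           / Ns[l] for k in range(K)] for l in range(L)])
        et_new = np.array([[np.trace(T[l][k] @ Psit[co[k]:co[k + 1], co[k]:co[k + 1]]).real
                            / ns[k] for k in range(K)] for l in range(L)])
        done = max(np.abs(e_new - e).max(), np.abs(et_new - et).max()) < tol
        e, et = e_new, et_new
        if done:
            break
    return np.trace(Psi).real / N, e, et
```

In the special case $K = L = 1$ with identity correlation matrices and no LOS component, the returned value can be compared with the positive root of $c\omega m^2 + (1 - c + \omega)m - 1 = 0$, $c = N/n$, which is the Marchenko-Pastur equation for this normalization.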

Remark 1 If $X_{ij}^{(l,k)}$'s are Gaussian, the assumption that $X_{ij}^{(l,k)}$'s have finite 6-th order moment is naturally satisfied. When the amplitudes of the channel fading coefficients follow the Nakagami and log-normal distributions, Theorem 1 is applicable since these distributions have finite 6-th order moment. In Appendix B, the proof of $\mathsf{E}\{m_{\mathbf{B}_N}\} - \frac{1}{N}\mathrm{tr}(\boldsymbol{\Psi}) = O\left(\frac{1}{\sqrt{N}}\right)$ is given under the assumption that $X_{ij}^{(l,k)}$'s have finite 6-th order moment. In fact, with additional arguments, a more general case can be obtained. Specifically, if $X_{ij}^{(l,k)}$'s have only finite second moment, we can prove that $\mathsf{E}\{m_{\mathbf{B}_N}(\omega)\} - \frac{1}{N}\mathrm{tr}(\boldsymbol{\Psi}(\omega)) = O(\varepsilon_n)$, where $\varepsilon_n$ is a positive sequence converging to zero. However, it should be noted that with the finite 6-th order moment assumption, the proof of $\mathsf{E}\{m_{\mathbf{B}_N}\} - \frac{1}{N}\mathrm{tr}(\boldsymbol{\Psi}) = O\left(\frac{1}{\sqrt{N}}\right)$ is much simpler than in the general case. The proof of the general case requires some additional truncation, centralization, and rescaling techniques together with some careful derivations like those in [19]. Since these are beyond the scope of this paper, we do not show the detailed proof for this general case. Interested readers can refer to [19].

Remark 2 Theorem 1 is developed under the asymptotic regime where $L$, $K$ are fixed but $\{N_l, n_k\}$'s all grow to infinity with fixed ratios. For other applications, we might be interested in the cases with fixed $\{N_l, n_k\}$'s while $L$ and $K$ grow to infinity. In this case, the entries of $\mathbf{X}_{l,k}$'s will be normalized by $\sqrt{n}$ rather than $\sqrt{n_k}$, and a similar$^7$ deterministic equivalent result as that of Theorem 1 can be obtained.

We then derive a deterministic equivalent of the ergodic sum rate of the large-scale MIMO MAC in the 
following theorem. 



7 The results differ only by some scalar adjustments.
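Theorem 2 Assuming that $\mathbf{B}_N$ follows the hypotheses of Theorem 1, as $N \to \infty$, the Shannon transform of $\mathbf{B}_N$ satisfies

$$\mathsf{E}\{V_{\mathbf{B}_N}(\sigma^2)\} - V_N(\sigma^2) = O\left(\frac{1}{\sqrt{N}}\right), \tag{18}$$

where

$$V_N(\sigma^2) = \frac{1}{N}\log\det\left(\frac{\boldsymbol{\Psi}^{-1}}{\sigma^2}\right) + \frac{1}{N}\log\det\left(\frac{\tilde{\boldsymbol{\Phi}}^{-1}}{\sigma^2}\right) - \frac{\sigma^2}{N}\sum_{l,k} N_l\, \tilde{e}_{l,k}(\sigma^2)\, e_{l,k}(\sigma^2) \tag{19a}$$

$$\phantom{V_N(\sigma^2)} = \frac{1}{N}\log\det\left(\frac{\tilde{\boldsymbol{\Psi}}^{-1}}{\sigma^2}\right) + \frac{1}{N}\sum_{l=1}^{L}\log\det\left(\frac{\boldsymbol{\Phi}_l^{-1}}{\sigma^2}\right) - \frac{\sigma^2}{N}\sum_{l,k} N_l\, \tilde{e}_{l,k}(\sigma^2)\, e_{l,k}(\sigma^2). \tag{19b}$$

Furthermore, if $\mathbf{X}_{l,k}$'s are Gaussian, we have, as $N \to \infty$,

$$N\left(\mathsf{E}\{V_{\mathbf{B}_N}(\sigma^2)\} - V_N(\sigma^2)\right) = O\left(\frac{1}{N}\right). \tag{20}$$

Proof: By (16) in Theorem 1 together with the dominated convergence theorem, (18) is obtained. Then, we show that $\int_{\sigma^2}^{\infty}\left(\frac{1}{\omega} - \frac{1}{N}\mathrm{tr}(\boldsymbol{\Psi}(\omega))\right) d\omega$ can be written more explicitly as (19a). The details of the proof are similar to those in [19, Theorem 3], and thus omitted. Since $\det(\mathbf{I} + \mathbf{A}\mathbf{B}) = \det(\mathbf{I} + \mathbf{B}\mathbf{A})$, we then have (19b). On the other hand, (20) can be obtained by (17) in Theorem 1. □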

Remark 3 With (18), we can get the deterministic equivalent of the ergodic sum rate regarding the number 
of nats per antenna. However, (20) shows the convergence regarding the total ergodic sum rate and as a 
consequence has a wider range of applications for the performance evaluation criteria. 
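The accuracy claim can be illustrated in the simplest special case, $K = L = 1$ with identity correlation matrices and no LOS component, where the fixed-point system reduces to a scalar quadratic equation. The sketch below is a hedged illustration rather than the paper's simulation setup: it assumes that in this case the deterministic equivalent reduces to $V_N(\sigma^2) = \log(1 + \tilde{e}) + \beta^{-1}\log(1 + \beta e) - \sigma^2 e \tilde{e}$ with $\beta = N/n$, and it compares this with a Monte-Carlo average over Gaussian channels. Function names are hypothetical.

```python
import numpy as np

def mp_equivalent(sigma2, beta):
    """Return (e, e_tilde, V) for one link with R = I, T = I, no LOS, beta = N/n.

    e solves beta*sigma2*e^2 + (1 - beta + sigma2)*e - 1 = 0 (positive root),
    e_tilde = 1 / (sigma2 * (1 + beta * e)), and
    V = log(1 + e_tilde) + log(1 + beta * e) / beta - sigma2 * e * e_tilde.
    """
    b = 1.0 - beta + sigma2
    e = (-b + np.sqrt(b * b + 4.0 * beta * sigma2)) / (2.0 * beta * sigma2)
    e_t = 1.0 / (sigma2 * (1.0 + beta * e))
    V = np.log1p(e_t) + np.log1p(beta * e) / beta - sigma2 * e * e_t
    return e, e_t, V

def monte_carlo_rate(N, n, sigma2, trials, rng):
    """Average of (1/N) log det(I + X X^H / sigma2) over Gaussian X (variance 1/n)."""
    total = 0.0
    for _ in range(trials):
        X = (rng.standard_normal((N, n))
             + 1j * rng.standard_normal((N, n))) / np.sqrt(2 * n)
        total += np.linalg.slogdet(np.eye(N) + X @ X.conj().T / sigma2)[1]
    return total / (trials * N)
```

Even for modest dimensions such as `N = 24`, `n = 16`, the value `mp_equivalent(sigma2, N / n)[2]` is expected to lie close to `monte_carlo_rate(N, n, sigma2, trials, rng)`, in line with the per-antenna convergence stated in (18) and the total-rate convergence in (20).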

Over the last few years, many deterministic equivalent results have been obtained by using large dimensional RMT (e.g., [6, 10-16, 18, 19]). Since our model is fairly general, Theorem 2 may be interpreted as a unified formula that encompasses many such results. For the case with $K = 1$ and $\bar{\mathbf{H}} = \mathbf{0}$, $V_N(\sigma^2)$ agrees with that in [18, Theorem 2], in which $\{\mathbf{X}_{l,1}\}_{\forall l}$ are assumed to be Gaussian. Theorem 2 thus extends its application to the non-Gaussian scenarios in this sense. Indeed, if $\bar{\mathbf{H}} = \mathbf{0}$, (19) was first presented in [33, (23)], where the replica method was used. Also, for the case with $K = 2$, $L = 1$, and $\{\mathbf{R}_{1,k} = \mathbf{R}\}_{\forall k}$, Theorem 2 is consistent with the results in [34] obtained by the replica method, which is however mathematically incomplete. In contrast, Theorem 2 is not only mathematically rigorous but also more general than the proposition in [33] in the sense that $\bar{\mathbf{H}} \neq \mathbf{0}$ is allowed and there is no requirement of a Gaussian distribution on the entries of $\mathbf{X}_{l,k}$'s. Finally, if $n_k = 1$ and $N_l = 1$ for all $k, l$, then Theorem 2 degenerates to that in [12] (or [11] without the LOS components). Clearly, in contrast with [11, 12], Theorem 2 allows the UEs and each antenna set of the BS to be equipped with multiple spatially correlated antennas.

As mentioned before, deterministic equivalent results together with optimization approaches have found 
numerous applications in system optimization designs [9, 23-25]. For example, based on the deterministic 
equivalent result of [11], the authors of [9] devised an algorithm to compute the ergodic sum rate subject to a 
general fairness criterion. Also, based on [11], the authors of [23] derived an analytical expression of a system 
spectral efficiency when multiple BSs employ joint transmission with linear zero-forcing beamforming. 
They also developed a downlink scheduling scheme under a fairness criterion. Our deterministic equivalent 
results provide a promising foundation to these applications while under the more general large-scale MIMO 
system. In addition, a deterministic equivalent for the SINR at the output of the MMSE receiver can be 
derived using our deterministic equivalent results. Due to space limitations, such applications through 
Theorems 1-2 are left out. In the next subsection, our aim is to answer one of the fundamental questions: 
How should the input covariances be designed so that the ergodic sum rate can be maximized? 

3.2 Ergodic Capacity 

It is well known that the ergodic sum capacity of a MIMO MAC is achieved by selecting proper input 
covariance matrices so that the ergodic sum rate is maximized [35] . In this subsection, we aim to design the 
optimal covariance matrices using the deterministic equivalent results. Firstly, we state that the covariance 
matrices maximizing the deterministic equivalent of the ergodic sum rate yield a result which converges 
to the ergodic capacity. After that, these optimal covariance matrices will be shown to be structurally 
equivalent to an iterative waterfilling procedure over a deterministic channel. Finally, we propose an 
iterative waterfilling algorithm for finding the capacity-achieving input covariance matrices. 

Let $\mathbf{Q}_k$ be the input covariance matrix of UE$_k$, which satisfies $\mathrm{tr}(\mathbf{Q}_k) \le n_k$.$^8$ With the input covariance matrices $\mathbf{Q} = \mathrm{diag}(\mathbf{Q}_1, \ldots, \mathbf{Q}_K)$, we thus write the ergodic sum rate of the large-scale MIMO MAC as

$$V_{\mathbf{B}_N}(\sigma^2, \mathbf{Q}_1, \ldots, \mathbf{Q}_K) = \frac{1}{N}\mathsf{E}\left\{\log\det\left(\mathbf{I}_N + \frac{1}{\sigma^2}\mathbf{H}\mathbf{Q}\mathbf{H}^H\right)\right\}. \tag{21}$$

Then, the ergodic capacity under the power constraint is given by

$$\max_{\mathbf{Q}_k \in \mathbb{Q}_k, \forall k} V_{\mathbf{B}_N}(\sigma^2, \mathbf{Q}_1, \ldots, \mathbf{Q}_K), \tag{22}$$

8 The power constraint can be replaced by $\mathrm{tr}(\mathbf{Q}_k) \le P_k n_k$ with $P_k$ being any finite positive value independent of the matrix dimension. Note that the current setting $\mathrm{tr}(\mathbf{Q}_k) \le n_k$ is for notational brevity only.

where

$$\mathbb{Q}_k = \left\{\mathbf{Q}_k : \mathrm{tr}(\mathbf{Q}_k) \le n_k \text{ and } \mathbf{Q}_k \succeq \mathbf{0}\right\}$$

is the feasible set of $\mathbf{Q}_k$. The problem (22) is convex and can be solved using stochastic programming based on convex optimization with Monte-Carlo methods [36]. Specifically, we can apply the method in [37] (called the Vu-Paulraj algorithm), which was developed based on the barrier method [36, Chap. 11] where the related gradient and Hessian are approximated by Monte-Carlo methods. Since $\mathbf{Q}_k$ is a Hermitian matrix of size $n_k \times n_k$, the optimization involves $n_k$ real entries on the diagonal and $n_k(n_k - 1)/2$ complex entries in the upper triangle. The complexity of such an algorithm is high and it requires a long execution time. We thus propose an approximate approach using the deterministic equivalent results in Theorem 2.

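In Theorem 2, we have shown that the deterministic equivalent results are invariant to the type of fading distribution. As a result, the asymptotically optimal input covariances, which are designed based on the deterministic equivalent results, are also invariant to the type of fading distribution. To get the deterministic equivalent of $\mathsf{E}\{V_{\mathbf{B}_N}(\sigma^2, \mathbf{Q}_1, \ldots, \mathbf{Q}_K)\}$, the effect of $\mathbf{Q}_k$ has to be included in $V_N(\sigma^2)$. With Theorem 2, this can be easily accomplished by the following replacements: for $1 \le l \le L$,

$$\mathbf{T}_{l,k} := \mathbf{Q}_k^{1/2}\mathbf{T}_{l,k}\mathbf{Q}_k^{1/2}, \quad \text{and} \quad \bar{\mathbf{H}}_{l,k} := \bar{\mathbf{H}}_{l,k}\mathbf{Q}_k^{1/2}. \tag{23}$$

Now, let $V_N(\sigma^2, \mathbf{Q}_1, \ldots, \mathbf{Q}_K)$ be the result obtained from $V_N(\sigma^2)$ with $\mathbf{T}_{l,k}$ and $\bar{\mathbf{H}}_{l,k}$ based on the above replacements. Then, (19b) becomes

$$V_N(\sigma^2, \mathbf{Q}_1, \ldots, \mathbf{Q}_K) = \frac{1}{N}\log\det\left(\mathbf{I}_n + \mathbf{F}\mathbf{Q}\right) + \frac{1}{N}\sum_{l=1}^{L}\log\det\left(\frac{\boldsymbol{\Phi}_l^{-1}}{\sigma^2}\right) - \frac{\sigma^2}{N}\sum_{l,k} N_l\, \tilde{e}_{l,k}(\sigma^2)\, e_{l,k}(\sigma^2), \tag{24}$$

where

$$\mathbf{F} = \mathrm{diag}\left(\sum_{l=1}^{L}\beta_{l,1} e_{l,1}(\sigma^2)\mathbf{T}_{l,1}, \ldots, \sum_{l=1}^{L}\beta_{l,K} e_{l,K}(\sigma^2)\mathbf{T}_{l,K}\right) + \bar{\mathbf{H}}^H \boldsymbol{\Phi}(\sigma^2) \bar{\mathbf{H}}. \tag{25}$$

Note that $\mathbf{Q}_k$'s appear in $\tilde{e}_{l,k}(\sigma^2)$'s, i.e., $\tilde{e}_{l,k}(\omega) = \frac{1}{n_k}\mathrm{tr}\left(\mathbf{Q}_k^{1/2}\mathbf{T}_{l,k}\mathbf{Q}_k^{1/2} \langle \tilde{\boldsymbol{\Psi}}(\omega) \rangle_k\right)$, and thus are involved in all three terms of (24). Using the deterministic equivalent result, we have the optimization problem:

$$\max_{\mathbf{Q}_k \in \mathbb{Q}_k, \forall k} V_N(\sigma^2, \mathbf{Q}_1, \ldots, \mathbf{Q}_K). \tag{26}$$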

Before solving the above problem, two important issues must be resolved. One is to establish the concavity of $V_N(\sigma^2, \mathbf{Q}_1, \ldots, \mathbf{Q}_K)$ with respect to $(\mathbf{Q}_1, \ldots, \mathbf{Q}_K)$, and the other is to ensure that $\mathsf{E}\{V_{\mathbf{B}_N}(\sigma^2, \mathbf{Q}_1^\circ, \ldots, \mathbf{Q}_K^\circ)\} - V_N(\sigma^2, \mathbf{Q}_1^\star, \ldots, \mathbf{Q}_K^\star)$ goes asymptotically to zero, where $(\mathbf{Q}_1^\circ, \ldots, \mathbf{Q}_K^\circ)$ and $(\mathbf{Q}_1^\star, \ldots, \mathbf{Q}_K^\star)$ are the maximizers of (22) and (26), respectively. The required results are described by the following proposition.

Proposition 1 We have:

1. The function $(\mathbf{Q}_1, \ldots, \mathbf{Q}_K) \mapsto V_N(\sigma^2, \mathbf{Q}_1, \ldots, \mathbf{Q}_K)$ is strictly concave on $(\mathbf{Q}_1, \ldots, \mathbf{Q}_K)$.

2. In addition to Assumption 2, suppose that $\mathbf{Q}_k^\circ$'s and $\mathbf{Q}_k^\star$'s lie within a set of positive semi-definite matrices with bounded spectral norm. Then, we have

$$\mathsf{E}\{V_{\mathbf{B}_N}(\sigma^2, \mathbf{Q}_1^\circ, \ldots, \mathbf{Q}_K^\circ)\} - V_N(\sigma^2, \mathbf{Q}_1^\star, \ldots, \mathbf{Q}_K^\star) = O\left(\frac{1}{\sqrt{N}}\right). \tag{27}$$

Furthermore, if $\mathbf{X}_{l,k}$'s are Gaussian, then (27) becomes $O\left(\frac{1}{N^2}\right)$.

Proof: The proof is similar to that in [15, Theorem 4 and Proposition 3] and [18, Theorem 3 and Proposition 4], and therefore omitted. □

So far, we have stated that $(Q_1^{\star}, \ldots, Q_K^{\star})$ yields a result which converges to the ergodic capacity. Next,
by using tools from convex optimization [36], we will gain a better understanding of the structure of
$(Q_1^{\star}, \ldots, Q_K^{\star})$. In particular, our next proposition will state that the optimal covariance matrices have
the structure of an iterative waterfilling solution over a deterministic equivalent channel.

To that end, we start by defining the Lagrangian of the optimization problem (26) as

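$$\mathcal{L}(Q, \Upsilon, \mu) = -V_N(\sigma^2, Q_1, \ldots, Q_K) + \sum_{k=1}^{K} \mathrm{tr}(\Upsilon_k Q_k) + \sum_{k=1}^{K} \mu_k \big(n_k - \mathrm{tr}(Q_k)\big), \qquad (28)$$

where $\Upsilon \triangleq \{\Upsilon_k\}_{\forall k}$ and $\mu \triangleq \{\mu_k\}_{\forall k}$ are the Lagrange multipliers associated with the problem constraints.
In order to express the partial derivative of $V_N(\sigma^2, Q_1, \ldots, Q_K)$ with respect to $Q_k$, i.e., $\partial V_N / \partial Q_k$, we define
$X(\sigma^2, Q_1, \ldots, Q_K) \triangleq \frac{1}{N} \log\det(I_n + FQ)$. From (24), it is noted that the parameters affected by the
perturbation of $Q_k$ are $X(\sigma^2, Q_1, \ldots, Q_K)$, $\{e_{l,k}\}_{\forall l,k}$, and $\{\tilde{e}_{l,k}\}_{\forall l,k}$. As a result, we have

$$\frac{\partial V_N}{\partial Q_k} = \frac{\partial V_N}{\partial X}\frac{\partial X}{\partial Q_k} + \sum_{l',k'} \frac{\partial V_N}{\partial e_{l',k'}}\frac{\partial e_{l',k'}}{\partial Q_k} + \sum_{l',k'} \frac{\partial V_N}{\partial \tilde{e}_{l',k'}}\frac{\partial \tilde{e}_{l',k'}}{\partial Q_k}. \qquad (29)$$

It can be checked that $\partial V_N / \partial e_{l,k} = 0$ and $\partial V_N / \partial \tilde{e}_{l,k} = 0$, $\forall l, k$. Therefore, the Karush-Kuhn-Tucker (KKT) conditions
of (26) are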



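$$-\frac{1}{N}\big\langle (I_n + FQ)^{-1} F \big\rangle_k + \Upsilon_k - \mu_k I_{n_k} = 0,$$
$$\mathrm{tr}(\Upsilon_k Q_k) = 0, \quad \Upsilon_k \succeq 0, \quad Q_k \succeq 0, \qquad (30)$$
$$\mu_k \big(n_k - \mathrm{tr}(Q_k)\big) = 0, \quad \mu_k \ge 0,$$
for $k = 1, \ldots, K$.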

Since (26) is a convex optimization problem with constraints satisfying Slater's condition, the optimal
$Q_k$'s can be found by solving the KKT conditions [36]. Using Lemma 14, the first line of (30) can be
rewritten as

$$-\frac{1}{N}(I_{n_k} + P_k Q_k)^{-1} P_k + \Upsilon_k - \mu_k I_{n_k} = 0, \qquad (31)$$

where

$$P_k \triangleq \big\langle (I_n + F Q_{\backslash k})^{-1} F \big\rangle_k, \qquad (32)$$
$$Q_{\backslash k} \triangleq \mathrm{diag}(Q_1, \ldots, Q_{k-1}, 0, Q_{k+1}, \ldots, Q_K). \qquad (33)$$

Note that $P_k$ is a function of $(Q_1, \ldots, Q_K)$ rather than only $Q_{\backslash k}$, as $F$ defined in (25) involves all the
$Q_k$'s. For brevity, we have omitted its argument when writing $P_k$. Substituting (31) for the first line of
(30), the KKT conditions (30) are now equivalent to those of the following optimization problem:

$$\max_{Q_k \in \mathcal{Q}_k} \log\det(I_{n_k} + P_k Q_k), \qquad (34)$$
which can be solved by a standard iterative waterfilling procedure. Thus, we get the next proposition.

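Proposition 2 Let $P_k^{\star}$ be the matrix in (32) obtained by replacing $(Q_1, \ldots, Q_k, \ldots, Q_K)$ with $(Q_1^{\star}, \ldots, Q_k^{\star}, \ldots, Q_K^{\star})$,

and let $P_k^{\star} = U_{P_k} \Lambda_{P_k} U_{P_k}^H$. The eigenvectors of $Q_k^{\star}$ coincide with the right singular vectors of the matrix $P_k^{\star}$; i.e.,

$$Q_k^{\star} = U_{P_k} \Lambda_{Q_k}^{\star} U_{P_k}^H, \qquad (35)$$

and the eigenvalues are given by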

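$$\Lambda_{Q_k}^{\star} = \left(\frac{1}{\mu_k} I_{n_k} - \Lambda_{P_k}^{-1}\right)^{+}, \qquad (36)$$

where $(a)^{+} = \max\{0, a\}$ and $\mu_k$ is chosen to satisfy the power constraint $\mathrm{tr}(Q_k^{\star}) = n_k$.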
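
As an illustration of the eigenvalue rule in Proposition 2, the following minimal sketch performs scalar waterfilling over a given list of eigenvalues of $P_k^{\star}$. It is only a sketch under stated assumptions: the function name `waterfill`, the sample eigenvalues, and the choice to return the water level (which plays the role of the multiplier-dependent constant in (36)) are hypothetical and not taken from the paper.

```python
def waterfill(gains, total_power):
    """Allocate q_i = max(0, level - 1/g_i) so that sum(q_i) == total_power.

    gains: nonnegative eigenvalues of P_k (illustrative input, not from the paper).
    Returns the per-eigenmode powers (in the input order) and the water level.
    """
    # Strongest eigenmodes first; zero-gain modes never receive power.
    order = sorted(range(len(gains)), key=lambda i: -gains[i])
    inv = [1.0 / gains[i] for i in order if gains[i] > 0]

    level, active = 0.0, 0
    for m in range(1, len(inv) + 1):
        # Water level if exactly the m strongest modes are active.
        candidate = (total_power + sum(inv[:m])) / m
        if candidate > inv[m - 1]:
            level, active = candidate, m
        else:
            break

    powers = [0.0] * len(gains)
    for rank in range(active):
        powers[order[rank]] = level - inv[rank]
    return powers, level


# Hypothetical example: three eigenmodes and a power budget n_k = 3.
powers, level = waterfill([4.0, 1.0, 0.25], 3.0)
```

In this example the weakest eigenmode receives no power, which is the thresholding effect of the $(\cdot)^{+}$ operation.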
Using Proposition 2, we have the following observations:
• $\bar{H} = 0$: In this case, $P_k = \sum_{l=1}^{L} \beta_{l,k} e_{l,k}(\sigma^2) T_{l,k}$. Therefore, the optimal transmit directions align
with the eigenvectors of a weighted sum of the $T_{l,k}$'s. As such, $\beta_{l,k} e_{l,k}(\sigma^2)$ can be understood as the
equivalent channel gain contributed by BS$_l$.

• $H = \bar{H}$: This implies that the channels are deterministic. In this case,



It shows that the optimal input covariance matrix of each user follows the water-filling principle that
treats the other users as noise. This characteristic agrees with that for finite-size systems [35].

• $K = 1$: In this case, we have



If $\{R_{l,1} = I_{N_l}\}_{\forall l}$, the optimal transmit directions thus align with the eigenvectors of a weighted
sum of the $T_{l,1}$'s and $\bar{H}_{l,1}^H \bar{H}_{l,1}$'s. If instead $R_{l,1} \ne I_{N_l}$, the impact of $R_{l,1}$ on the optimal transmit directions
enters through $\bar{H}_{l,1}$ via $(\tilde{e}_{l,1} R_{l,1} + I_{N_l})^{-1} \bar{H}_{l,1}$. It appears that if the link pair does not have LOS,
the corresponding correlation pattern at the receiver side does not have a "direct" impact on the
structure of the optimal transmit directions. Nevertheless, this inference is not entirely true, since
the optimal transmit directions can still be changed by the correlation pattern at the receiver side
through $\beta_{l,1} e_{l,1}$. We will illustrate this phenomenon by an example in the simulation results.

Through the observations above, Proposition 2 shows its potential in understanding the impact of
antenna correlations and LOS components on the structure of the optimal transmit directions. We
now introduce an iterative algorithm for optimizing $V_N(\sigma^2, Q_1, \ldots, Q_K)$ which adapts the parameters $Q$ and
$\{e_{l,k}\}_{\forall l,k}$, $\{\tilde{e}_{l,k}\}_{\forall l,k}$ separately.

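Algorithm 1 (Optimization for Q)

• Initialization: $Q_k^{(0)} = I_{n_k}$, $e_{l,k}^{(0)} = 1$ and $\tilde{e}_{l,k}^{(0)} = 1$ for $l = 1, \ldots, L$ and $k = 1, \ldots, K$.

• Iteration $t$:

- Given that $Q_k^{(t-1)}$, $e_{l,k}^{(t-1)}$ and $\tilde{e}_{l,k}^{(t-1)}$ are available, for $l = 1, \ldots, L$ and $k = 1, \ldots, K$;

- Calculate $T_{l,k}$ and $\bar{H}_{l,k}$ by the replacements of (23) for $l = 1, \ldots, L$ and $k = 1, \ldots, K$. Then,
$\{e_{l,k}^{(t)}\}_{\forall l,k}$, $\{\tilde{e}_{l,k}^{(t)}\}_{\forall l,k}$ are obtained by

$$e_{l,k}^{(t)} = \frac{1}{n_k}\mathrm{tr}\big(R_{l,k} \langle \Psi^{(t-1)} \rangle_l\big), \quad \tilde{e}_{l,k}^{(t)} = \frac{1}{n_k}\mathrm{tr}\big(T_{l,k} \langle \tilde{\Psi}^{(t-1)} \rangle_k\big), \qquad (37)$$

where $\Psi^{(t-1)}$ and $\tilde{\Psi}^{(t-1)}$ are evaluated with $\{e_{l,k}^{(t-1)}\}_{\forall l,k}$ and $\{\tilde{e}_{l,k}^{(t-1)}\}_{\forall l,k}$;

- Calculate $P_k^{(t)}$ based on (32), for $k = 1, \ldots, K$;

- Calculate $Q_k^{(t)}$ based on Proposition 2, for $k = 1, \ldots, K$.

• Update $t := t + 1$ until

$$\big| V_N(\sigma^2, Q_1^{(t)}, \ldots, Q_K^{(t)}) - V_N(\sigma^2, Q_1^{(t-1)}, \ldots, Q_K^{(t-1)}) \big|$$

is small enough.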
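
The control flow of Algorithm 1 (one update per iteration, stopped when the objective changes by less than a tolerance) can be sketched generically as below. This is not an implementation of the algorithm itself: the helper name `iterate_until_stable`, the tolerance, and the toy scalar map that stands in for the updates of $e_{l,k}$, $\tilde{e}_{l,k}$, $P_k$ and $Q_k$ are all assumptions made for illustration.

```python
def iterate_until_stable(step, objective, state, tol=1e-9, max_iter=500):
    """Apply `step` once per iteration until `objective` changes by less than `tol`."""
    previous = objective(state)
    for t in range(1, max_iter + 1):
        state = step(state)          # a single update, not an inner fixed-point solve
        current = objective(state)
        if abs(current - previous) < tol:
            return state, t
        previous = current
    return state, max_iter


# Toy stand-in (assumption): the scalar map e -> 1 / (1 + e),
# whose fixed point is (sqrt(5) - 1) / 2.
e_final, iterations = iterate_until_stable(
    lambda e: 1.0 / (1.0 + e),   # hypothetical single-step update
    lambda e: e,                 # hypothetical objective to monitor
    1.0,
)
```

The point of the sketch is the design choice discussed in the text: each pass performs a single update of the auxiliary quantities instead of solving their fixed-point equations to convergence inside every iteration.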



A similar iteration procedure was adopted by [34]. For the case with $K = L = 1$ and $\bar{H} = 0$, the
convergence of Algorithm 1 has been proved in [34]. Note that Algorithm 1 is slightly different from those
in [15,16,18,38,39], named the frozen water-filling. For the frozen water-filling, $\{e_{l,k}^{(t)}\}_{\forall l,k}$, $\{\tilde{e}_{l,k}^{(t)}\}_{\forall l,k}$ are
defined as the unique solutions of (14) at every iteration step $t$, while in Algorithm 1, $\{e_{l,k}^{(t)}\}_{\forall l,k}$, $\{\tilde{e}_{l,k}^{(t)}\}_{\forall l,k}$
are obtained by performing a single update. It was pointed out in [34] that the frozen water-filling algorithm
does not always converge (see footnote 9). The convergence proof of Algorithm 1 is still an open problem.



4 Simulation Results 

In this section, computer simulations are conducted to evaluate the accuracy of the approximation $V_N(\sigma^2)$
in Theorem 2, and the effectiveness of the iterative algorithm developed in Algorithm 1. In particular, we
are interested in their performance when the numbers of antennas are not so large. The simulation settings

9 Note that an example of oscillating behavior of the frozen water-filling algorithm is artificially constructed in [34]. However, 
there is no known condition (e.g., spatial correlation pattern) to exclude such behavior of the frozen water-filling algorithm. 
[Figure 2, panels (a) and (b); horizontal axis: $1/\sigma^2$ (dB)]

Figure 2: Ergodic sum rate versus SNR with $N_1 = N_2 = n_1 = n_2 = 2$ and $N_1 = N_2 = n_1 = n_2 = 8$ for
(a) $\{\kappa_{l,k} = 0, \forall l, k\}$ and (b) $\{\kappa_{l,k} = 1, \forall l, k\}$. The solid lines plot the deterministic equivalent results, while
the markers plot the Monte-Carlo simulation results under different fading distributions.
Table 1: Angular parameters.

are based on the propagation model introduced in [40], in which the spatial correlation is generated from
a uniform linear array with half-wavelength spacing in a wireless scenario where there is one propagation
path cluster with a Gaussian power azimuthal distribution having mean angle $\theta_{l,k}$ and root-mean-square
spread $\delta_{l,k}$. Specifically, we take the correlation matrix with elements [10]



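$$[T_{l,k}]_{m,n} \ \big(\text{or } [R_{l,k}]_{m,n}\big) = \int_{-180}^{180} \frac{d\phi}{\sqrt{2\pi\delta_{l,k}^2}}\, e^{\,j\pi(m-n)\sin\left(\frac{\pi\phi}{180}\right) - \frac{(\phi - \theta_{l,k})^2}{2\delta_{l,k}^2}}, \qquad (38)$$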



with $m, n$ being the indices of antennas. In addition, we use the superscripts T and R, respectively, to refer
to the corresponding values at the transmit and receive sides. The LOS matrix $\bar{H}_{l,k}$ is generated according
to $\bar{H}_{l,k} = a_{R,l}(\theta^{R}_{l,k})\, a_{T,k}(\theta^{T}_{l,k})^H$, where

$$a_{R,l}(\theta^{R}_{l,k}) = \left[1,\ e^{j\pi\sin(\theta^{R}_{l,k})},\ \ldots,\ e^{j\pi(N_l - 1)\sin(\theta^{R}_{l,k})}\right]^T.$$

Regarding the fading distribution, we assume that xf^ is of the form wj[fj cos(0p^- ) +jWj sin(0^^ 



[41], where #r^-'s (and flj'^'s) are the phases modeled as i.i.d. uniform random variables over [0,27r], and 



M,k), 



those Wj^fj's (and Wj^'s) are the amplitude fading drawn from a distribution with E{(W^'^')' 2 } = 1. The 
typical probability distributions of include the Rayleigh, Nakagami, and log-normal distributions 

[20,21]. Throughout this section, all the expected values (e.g., $\mathrm{E}\{V_{B_N}(\sigma^2)\}$) are obtained by the
Monte-Carlo method in which 10,000 independent realizations of $H$ are used for averaging.
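
The Monte-Carlo averaging described above can be illustrated on a deliberately simplified case. The sketch below estimates the ergodic rate of a single-antenna Rayleigh-fading link by averaging over 10,000 channel realizations; the scalar channel, the function name `ergodic_rate_mc`, and the fixed seed are assumptions made for illustration and do not reproduce the multi-user model of this paper.

```python
import math
import random


def ergodic_rate_mc(snr, trials=10_000, seed=0):
    """Monte-Carlo estimate of E{log(1 + snr * |h|^2)} in nats for h ~ CN(0, 1)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        # Real and imaginary parts of h are independent N(0, 1/2).
        re = rng.gauss(0.0, math.sqrt(0.5))
        im = rng.gauss(0.0, math.sqrt(0.5))
        total += math.log(1.0 + snr * (re * re + im * im))
    return total / trials


estimate = ergodic_rate_mc(snr=1.0)
```

For this scalar Rayleigh case the exact value is $e^{1/\mathrm{snr}} E_1(1/\mathrm{snr})$, about 0.596 nats at unit SNR, which the estimate should approach as the number of trials grows.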

In Theorem 2, we have shown that in the large-system limit the ergodic sum rate is invariant to the type of
fading distribution and can be well approximated by $V_N(\sigma^2)$. Therefore, it is important to see how well $V_N(\sigma^2)$
Table 2: Average execution time in seconds.

                              L = 10, K = 20   L = 20, K = 40   L = 30, K = 60
Monte-Carlo simulation        490              1941             4541
Deterministic approximation   0.5              3.2              8.2


in (19) approximates the ergodic sum rate $\mathrm{E}\{V_{B_N}(\sigma^2)\}$ when the dimensions of the system are not
so large. For this purpose, Figure 2 compares the results of $\mathrm{E}\{V_{B_N}(\sigma^2)\}$ with $V_N(\sigma^2)$ for $K = 2$ and
$L = 2$ under different fading distributions. The mean arrival/departure angles and angular spreads are
given in Table 1 and the distance-dependent pathlosses are $g_{1,1} = g_{2,2} = 1$ and $g_{1,2} = g_{2,1} = 0.25$. We
see that $V_N(\sigma^2)$ produces very good estimates of $\mathrm{E}\{V_{B_N}(\sigma^2)\}$ even when only a few antenna elements
(e.g., $N_1 = N_2 = n_1 = n_2 = 2$) are located at each UE and antenna set. As expected, when the number
of antennas grows large (e.g., $N_1 = N_2 = n_1 = n_2 = 8$), all curves tend to overlap regardless of the
distributions. In addition, we notice that for the Nakagami-m distribution, the difference between the cases
$m = 0.5$ and $m = 10$ is small even when there are only a few antenna elements.

In the above experiments, we have shown that $V_N(\sigma^2)$ provides a very good approximation for the sum
rate of finite-dimensional systems. Before proceeding, it is useful to discuss the computational efficiency
of evaluating $\mathrm{E}\{V_{B_N}(\sigma^2)\}$ through $V_N(\sigma^2)$. For the considered scenarios in Figure 2 with $K = 2$ and
$L = 2$, the execution time for evaluating $\mathrm{E}\{V_{B_N}(\sigma^2)\}$ by Monte-Carlo simulation is on the order of ten seconds (i.e., $10^{1}$ seconds).
Although the execution time for $V_N(\sigma^2)$ is only on the order of a centisecond (i.e., $10^{-2}$ seconds), one may
not be convinced to use $V_N(\sigma^2)$, since writing a program to evaluate $\mathrm{E}\{V_{B_N}(\sigma^2)\}$ is much easier than writing one
for $V_N(\sigma^2)$. However, when $K$ and $L$ grow, the Monte-Carlo simulations become very
demanding. Table 2 gives the average execution times on a 2.93 GHz Intel CPU with 4 GB of RAM under
various system sizes. Here, we set $\{N_l = n_k = 2\}_{\forall l,k}$, and the spatial correlation and LOS are generated
from an arbitrary pattern. For typical systems with about twenty distributed antenna sets and forty
users, the simulations become prohibitive, ruling out the possibility of other system optimization designs
such as scheduling [9,23]. Clearly, the proposed deterministic equivalent result is much more efficient in
this sense and provides a promising foundation for further applications of system optimization.
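
To make the comparison in Table 2 concrete, the short sketch below computes the ratio of the Monte-Carlo execution time to the deterministic-approximation execution time for each system size. The timings are those reported in Table 2; the dictionary keys and variable names are illustrative only.

```python
# Execution times in seconds, as reported in Table 2.
monte_carlo = {"L=10,K=20": 490.0, "L=20,K=40": 1941.0, "L=30,K=60": 4541.0}
deterministic = {"L=10,K=20": 0.5, "L=20,K=40": 3.2, "L=30,K=60": 8.2}

# Ratio of Monte-Carlo time to deterministic-approximation time per system size.
speedup = {size: monte_carlo[size] / deterministic[size] for size in monte_carlo}

for size, ratio in speedup.items():
    print(f"{size}: about {ratio:.0f} times faster")
```

For these entries the deterministic approximation is faster by a factor of several hundred to roughly one thousand.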

Next, we examine whether the input covariance design based on the deterministic equivalent results performs
well under different fading distributions when the numbers of antennas are not so large. Recall that
$\{Q_1^{\circ}, \ldots, Q_K^{\circ}\}$ denote the optimal solutions of (22) that maximize the ergodic sum rate, and $\{Q_1^{\star}, \ldots, Q_K^{\star}\}$
denote the optimal solutions of (26) that maximize the deterministic equivalent of the ergodic sum rate.
Algorithm 1 is used for solving $\{Q_1^{\star}, \ldots, Q_K^{\star}\}$, while $\{Q_1^{\circ}, \ldots, Q_K^{\circ}\}$ is solved by the Vu-Paulraj algorithm
Figure 3: Achievable rates versus SNR with $N_1 = N_2 = n_1 = n_2 = 2$ for (a) $\{\kappa_{l,k} = 0, \forall l, k\}$ and (b)
$\{\kappa_{l,k} = 1, \forall l, k\}$. The lines plot the results based on the deterministic equivalent, while the markers on
the dotted line plot the results for the Vu-Paulraj algorithm.

Figure 4: Antenna radiation patterns.



[37], which is based on the barrier method where the ergodic sum rate and its first and second derivatives
are calculated by the Monte-Carlo method. In contrast to $\{Q_1^{\circ}, \ldots, Q_K^{\circ}\}$, $\{Q_1^{\star}, \ldots, Q_K^{\star}\}$ is independent
of the true distributions of the $X_{l,k}$'s. In Figure 3, we depict $\mathrm{E}\{V_{B_N}(\sigma^2, Q_1, \ldots, Q_K)\}$ when the input
covariance matrices are $\{Q_1^{\circ}, \ldots, Q_K^{\circ}\}$, $\{Q_1^{\star}, \ldots, Q_K^{\star}\}$, and identity matrices, and when the amplitude fading
distributions are either Rayleigh or log-normal. These two distributions are considered because,
from Figure 2, the values of $\mathrm{E}\{V_{B_N}(\sigma^2)\}$ for the two distributions are significantly different and $V_N(\sigma^2)$
does not give a very good estimate of $\mathrm{E}\{V_{B_N}(\sigma^2)\}$ when the amplitude fading distribution is log-normal.
However, for both Rayleigh and log-normal distributions, the ergodic sum rate based on $\{Q_1^{\star}, \ldots, Q_K^{\star}\}$
is indistinguishable from that based on $\{Q_1^{\circ}, \ldots, Q_K^{\circ}\}$. In addition to providing
good performance, Algorithm 1 is computationally much more efficient than the Vu-Paulraj algorithm.

Finally, we discuss the fact mentioned in Section 3.2 that the optimal transmit directions can be changed
by the correlation pattern at the receiver side through $\beta_{l,1} e_{l,1}$. To understand this better, we consider two
scenarios with $K = 1$, $L = 2$ and $\bar{H} = 0$. The two scenarios use the same parameters except that the
radiation patterns at the receiver of the second antenna set have different beamwidths. Specifically, the
radiation patterns at the receiver of the second antenna set have $\delta^{R}_{2,1} = \ldots$ for scenario 1 and $\delta^{R}_{2,1} = 0.1$
for scenario 2. We find it useful to observe the array patterns by plotting the array factor (see footnote 10) in all directions.

10 Consider a uniform linear array with half-wavelength spacing. Given a vector $a \in \mathbb{C}^{n \times 1}$, we can get its array factor in
direction $\phi$ by

$$f(\phi) = \sum_{i=1}^{n} a_i\, e^{-j\pi i \sin(\phi)}.$$
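
The array factor of footnote 10 can be evaluated numerically as in the following sketch. The function name `array_factor`, the use of degrees for the direction, and the zero-based element index (which only changes a common phase and not the magnitude pattern) are assumptions made for illustration.

```python
import cmath
import math


def array_factor(weights, phi_deg):
    """Array factor of a half-wavelength-spaced uniform linear array.

    weights: complex element weights a_0, ..., a_{n-1} (zero-based index assumed).
    phi_deg: look direction in degrees.
    """
    phase = math.pi * math.sin(math.radians(phi_deg))
    return sum(w * cmath.exp(-1j * phase * i) for i, w in enumerate(weights))


# Hypothetical example: four elements steered towards 30 degrees.
n = 4
steer = math.pi * math.sin(math.radians(30.0))
weights = [cmath.exp(1j * steer * i) for i in range(n)]

peak = abs(array_factor(weights, 30.0))    # main-lobe magnitude
null = abs(array_factor(weights, -30.0))   # a null of this particular pattern
```

Plotting `abs(array_factor(weights, phi))` over a grid of directions gives a radiation pattern of the kind shown in Figure 4.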
The array patterns of $T_{1,1}$ and $T_{2,1}$ are depicted in Figure 4(a), where $\theta^{T}_{1,1} = 30°$, $\theta^{T}_{2,1} = 60°$, $\delta^{T}_{1,1} = 0.04$,
and $\delta^{T}_{2,1} = 0.03$. The array patterns of the optimal input covariance $Q_1$ for the two scenarios are given in
Figure 4(b) and (c), respectively. Figure 4(c) corresponds to the setting with the broader beamwidth of
$R_{2,1}$. In this case, the optimal covariance is shown to feed the signal largely according to $T_{2,1}$, showing
that the optimal transmit directions can be changed by the correlation pattern at the receiver side.

5 Conclusion 

By using large dimensional RMT, this paper investigated the deterministic equivalents for the large-scale
MIMO MAC. The considered model captures general features of large-scale MIMO channels, such as general
spatial correlation, LOS components, and non-Gaussian channel entries. In particular, we
derived the deterministic equivalent of the ergodic sum rate of the large-scale MIMO MAC. In addition,
through the deterministic equivalent of the ergodic sum rate, we investigated the capacity-achieving input
covariance matrices for the large-scale MIMO MAC and proposed an iterative waterfilling algorithm
for finding them. Finally, computer simulations were conducted and led to the following three conclusions.
First, the deterministic equivalent of the ergodic sum rate provides a very good approximation even when
the numbers of antennas are of practical size. Second, calculating the ergodic sum rate by using the
deterministic equivalent result is much more efficient than using the Monte-Carlo method when
the system sizes are large. Hence, the deterministic equivalent result is of interest for addressing complex
system optimization problems. Third, the optimal input covariance matrices predicted by the deterministic
equivalent result are remarkably close to those obtained by the corresponding finite-dimensional
optimization approach, and are obtained in a much more efficient manner.

Investigation of the central limit theorem of the sum rate for the large-scale MIMO MAC by using the 
mathematical framework in [13], as well as application of the deterministic equivalent results to system-level 
designs [9,23-25], are promising topics for future research. 
Appendix

A Proof of $\mathrm{E}\{m_{B_N}\} - \frac{1}{N}\mathrm{tr}(\Psi) = O\!\left(\frac{1}{N^2}\right)$ in Theorem 1

We start the proof by reformulating the channel model so that the derivation can be performed systemat- 
ically. To this end, we denote 

Ef,* - dia g (°JVi, • • • ,Ojvj_i)R-j,*;,Ojv z+ i, ■ • • ,0n l ) , (39) 
4 diag (0 ni , . . . , nfe , T I)fc , nfe+1 , . . . , nK ) . (40) 

Also, let k be the all-zero N x n matrix except that is used for its (X)j=i + 1) to (Yli=i Ni)-th 
row and (X^j=i n j + 1) to (X^=i ra j)-th column. As a result, H is statistically equivalent to 

h = = J2 (&,* +&,*) . (4i) 



where H Zfc = R l %X J)fc T? fc G C^", and X,, fc = 



G C JVxn consists of the random components 



of the channel. Here, X; k s are assumed to be mutually independent. From (6), we have 



li 



B N = ( £ (b^T^ + fly,) j |£ (sJOWEfik + Si,*) j (42) 

and 

B ^ = (eJ AfcX i + S >fc ) j f £ + j . (43) 

where Xj fc 's and fc 's are matrices with entries satisfying Assumption 1 but Xj fc 's are Gaussian. 
Let $S$ and $\tilde{S}$ be the resolvents of matrices $HH^H$ and $H^H H$, respectively, given by

$$S \triangleq (HH^H + \omega I_N)^{-1}, \qquad (44)$$
$$\tilde{S} \triangleq (H^H H + \omega I_n)^{-1}. \qquad (45)$$

These resolvents clearly satisfy the following useful properties:

$$S HH^H = I_N - \omega S, \quad \text{and} \quad \tilde{S} H^H H = I_n - \omega \tilde{S}. \qquad (46)$$

To simplify notation, we use $\mathring{a}$ to denote the zero-mean random variable $a - \mathrm{E}\{a\}$, where $a$ is a
random variable. To accomplish the proof, the following two lemmas are useful.

Lemma 1 (Integration by Parts Formula for Gaussian Functionals) (see, e.g., [27, Proposition 2.4]) Let
$\xi = [\xi_1, \ldots, \xi_M]^T$ be a complex Gaussian random vector such that $\mathrm{E}\{\xi\} = 0$ and $\mathrm{E}\{\xi\xi^H\} = \Omega$. Denoting
by $\Gamma(\xi)$ a complex function polynomially bounded with its derivatives, we have

$$\mathrm{E}\{\xi_p \Gamma(\xi)\} = \sum_{m=1}^{M} [\Omega]_{pm}\, \mathrm{E}\left\{\frac{\partial \Gamma(\xi)}{\partial \xi_m^{*}}\right\}. \qquad (47)$$
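
As a sanity check of the integration by parts formula (47), and not as part of the original proof, consider the scalar case $M = 1$ with $\xi \sim \mathcal{CN}(0, \Omega)$, treating $\xi$ and $\xi^{*}$ as independent variables when differentiating. The two test functions below are chosen here purely for illustration.

```latex
% Scalar check of (47) for \xi ~ CN(0, \Omega).
\begin{align*}
\Gamma(\xi) = \xi^{*}: &\quad
\mathrm{E}\{\xi\,\xi^{*}\} = \Omega
  = \Omega\,\mathrm{E}\left\{\frac{\partial \xi^{*}}{\partial \xi^{*}}\right\},\\
\Gamma(\xi) = \xi\,(\xi^{*})^{2}: &\quad
\mathrm{E}\{|\xi|^{4}\} = 2\Omega^{2}
  = \Omega\,\mathrm{E}\left\{\frac{\partial\,\xi(\xi^{*})^{2}}{\partial \xi^{*}}\right\}
  = \Omega\,\mathrm{E}\{2|\xi|^{2}\}.
\end{align*}
```

Both lines agree with the known second and fourth absolute moments of a circularly symmetric complex Gaussian variable.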



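Lemma 2 (The Poincaré-Nash Inequality) (see, e.g., [27, Proposition 2.5]) Let $\xi = [\xi_1, \ldots, \xi_M]^T$ be a
complex Gaussian random vector such that $\mathrm{E}\{\xi\} = 0$ and $\mathrm{E}\{\xi\xi^H\} = \Omega$. Denoting by $\Gamma(\xi)$ a complex
function polynomially bounded with its derivatives, the following inequality holds true:

$$\mathrm{var}(\Gamma(\xi)) \le \mathrm{E}\left\{ \nabla_{\xi}\Gamma(\xi)^T\, \Omega\, \big(\nabla_{\xi}\Gamma(\xi)\big)^{*} \right\} + \mathrm{E}\left\{ \big(\nabla_{\xi^{*}}\Gamma(\xi)\big)^H\, \Omega\, \big(\nabla_{\xi^{*}}\Gamma(\xi)\big) \right\}, \qquad (48)$$

where $\nabla_{\xi}\Gamma(\xi) = \left[\frac{\partial \Gamma}{\partial \xi_1}, \ldots, \frac{\partial \Gamma}{\partial \xi_M}\right]^T$ and $\nabla_{\xi^{*}}\Gamma(\xi) = \left[\frac{\partial \Gamma}{\partial \xi_1^{*}}, \ldots, \frac{\partial \Gamma}{\partial \xi_M^{*}}\right]^T$.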



The rigorous proof of Theorem 1 is rather involved. Although we follow the standard procedure for the MIMO
channel without the LOS components [18], several additional manipulations are required to handle the LOS
components. We split the proof into two steps: first, we prove that
$\frac{1}{N}\mathrm{tr}(\mathrm{E}\{S\} - \Psi) \to 0$; second, we refine the convergence rate to $\frac{1}{N}\mathrm{tr}(\mathrm{E}\{S\} - \Psi) = O\!\left(\frac{1}{N^2}\right)$. However,
it is difficult to prove directly that $\frac{1}{N}\mathrm{tr}(\mathrm{E}\{S\} - \Psi) \to 0$. To that end, we employ an intermediate quantity
between $\mathrm{E}\{S\}$ and $\Psi$ and establish the following two propositions.



Proposition 3 As M — > oo, we have 



tr(E{5} - 3) > 0, 
tr(E{5} - 3) > 0, 



(49a) 
(49b) 



where 



oj Ijv + diag 



^ &l,k^l,k > 
.fc=i J Ml) 



+ H0H 



H 



In + diagN£)a, )fc T, ifc l iH^H 



i=i 



Vfc' 



e^dia g (e 1 ,...,© L ), 

©4diag(0i,...,0^), 

K 

0,4 ' 



0i 



Co' 



V fe=l / 

I + ai tk Ti t ^\ 
\ i=i J 



, forl = l,...,L, 



, fork = l,...,K, 



a l)k ± — tr(R 4)fc E{((5»,}), 
a,, fe ^ -tr(T^E{(«S)J). 



(50a) 

(50b) 

(50c) 
(50d) 

(50e) 

(50f) 

(50g) 
(50h) 
Proof: See Appendix A.1.



Proposition 4 As J\f — > oo, we have 

tr(S - *) — ► 0, 
tr(S - *) — ► 0. 

Proof: See Appendix A.2.
From (49a) and (51a), the proof of $\mathrm{E}\{m_{B_N}\} - \frac{1}{N}\mathrm{tr}(\Psi) \to 0$ can be accomplished.

Proposition 5 As M — > oo, we have 



>(E{*} 




= o\ 


K N 2 


>(E{<S} 




= o\ 






-*) 


= o\ 






-*) 


= o\ 





Proof: See Appendix A.3.
Consequently, (17) then follows from (52a) and (52c). The proof is complete.

A.1 Proof of Proposition 3

From (41) and (44), we have 



U U 0J id 

<i,fci l,k 



and 



E{S pq }=-S pq --E{[SUH H ] pq } 

id 0J 

id id — id 

l,k Ij^h kytki 

We first calculate E{ [«SHH^] pg }. Using the integration by parts formula (47), we write 
^{s P iEff^>} =E{ Spi Hff^>} + E{S P1 ^}^ + E{S PI }^^> 

= —X^ T>( l >k) T {l,k)*r; f dSpjHgf * 1 1 T>(l,k)* T (l,k) F f dS pi 

„ / j —im —in "-l ~ (I k)* I <n 7_j±±qm -L-rn L l ~ (I k) 
nk mn [ 8H { ' ' ) Uk mn { 8W ' ' 



-mn ' '"■>" v " ±^-mn 



+ ^{O p i\iLij S-qr > 
and similarly,



+ ^ E^^'* 1 ^! T^b}^ + £{S P i}H!f^ M> - (57) 



Now, using the fact that 



Ukl m,n [ 9H_mn 



E — W) = " E {S P m[K H S] ni } , (58a) 
E { ~4|f^ } = -E{S mi [SUU} , (58b) 



E 



-mn 



E \ Spi5q m 5 rn S m i [SH] pn HV; k >} , (58c) 



we have 



and 



i-E{[5R Zjfc ] pg [T Z)fc H ff 5] ri }£g' fe) + E{^}fl^flg!*> (59) 



nk 



- E{[5R ilifc J P9 [T ilifcl H^] r J£;f ) + E{5 p j£g' fc) 4^ fcl) *. (60) 



Then, summing over i, we have 



^Jp^H = -T { ;; k >E{[SR ltk ] pq } - -E{tr(R ltk S)[SH^ k ] P1 ^ k >\ 



^E{[5R Zifc ] M [T, !fc H s 5H Zifc ] r ,} + E{[Sa ifc ] ra }^ fc) *, (61) 



and 



ijlSHy]^} = -^E{tr(R iifc 5)[5HT ijfc ] PJ 4^^)*} 

1 -E{[5R Jlifcl UHj lifcl H H 5a, fe ]ri5 pm } + E{[SH^}4^*. (62) 
Let a i,k = ^tr(R iifc E{5}) = £tr(R, )fc E{((S)),}) and ° %k ± ^tr(R, jfc S) - a l)k . Then, we get

= l.Tf; k> mm,kU} -«^E{[5HT Zifc ] p ^ fc )*} - e{^ [SHT^]^*)* } 

- -E{[5R iife ] P9 [T, > ,H H 5a ifc ] r ,} + E{[SR ltk ] PJ }H^ k> > (63) 



and 

EjpH^fl^**} = -^{[SHT^H^^} - e{^ [SHT^Hfy**} 

- ^-E{[«SR hjfcl U[T hifcl H^a ifc ] r ,} + E{[SK l>k ] pj }H^' kl)m . (64) 

From (63) and (64), we obtain 

E{[SH] pj H* qr } = £ ±- k T { ;; k) *E{[S^ k ] pq } — ^2 a «,fcE{ [StlT lk \ P jH* r ] 

/ . k I . k 

- ]T [SHT^ k } PJ H* qr } - ]T i-Ej^^U^^H^SH]^} + E{[SH] pi }iT* r . (65) 

Defining



l,k 







Co' 



--l,k 

l,k 



-1 



diag 0i,..., @k), (66) 



where 0^ is given by (50f). Multiplying both sides of (65) by [& k ]j r and summing over j and r, we get 

E{[SUU H ] pg } =cuJ2^&,k®M[S^,k] Pq }-uJ2 E {%k PHT,,*©H H ] M } 

' E {tr(T iifc H^5H0)[5R /ife ] pg } + uE { [5H0H ff ] M } . (67) 



UJ 

Lk 
This, together with (54), yields

E {S} =-I N - ^ -^tr(T^e)E{SR^} + £ Eft )jt 5HT ( ,9H H } 

l,k l,k 

+ £ -E{tr(T iifc H H 5H0)5R^} - e{5H0H^} 

~ E ^tr(T^0)E{«SR iifc } + ^ E{^ fc <SHTj fc 0H H j 

«,fc Z,fc 

+ E ^E{tr(5H0T ijfe (H + H)*)^} - e{<SH0H^} 

i,k k 

=7; lN ~ E ^tr(T, ifc 0)E{5R ijfc } + £ e{° %1c SHT l k &H H } 

K l,k 

+ E E |- tr («5H0T, fc H^) ) E{5R Jjfe } + £ e{$ <SR,, 

+ E E |- tr («5H0T^H^) } EjSR^} + E (pS " e{«SH0H"}, (68) 

l,k ^ nk > l,k ^ J 

where the third equality follows from the following definitions 

4 J_ tr to fc H fl ) , pg 4 J_tr (SU&T^U H ) . (69a) 

Before proceeding, we establish the following lemma. 
Lemma 3 

E {-tr(sH0T^H") 1 = - cj £ a hM e{ -tr(5H0T ijfe 0T ilifel H^) 1 - w £ E^ fcl ^ lfc l 



where 



p2U = ^tr(5H0T iifc 0T ilifcl H^j . (71) 



(70) 



Proof: Using the integration by parts formula (47), we write 



E 



h,ki 



dS pi 



h,ki 1 m,n i Oil, 

~ E ^-E{[5R ilifcl ] pp [T, iifci H^]„}[H0T Zjfc ], i . (72) 



29 



Summing over p, we have 

E | [H0Tj )Jfc ] jj [H H «S] „ | = - £ — pH©T, ifc ] ii E{tr(5R ilifcl )a iifcl H H 5] ri } 

li,fc! fcl 

= -J2 a llikl [neT^] tJ E{\R llM H H S] rt } 
Y [H@T^ k ]ij \T Jl ki iL H S] r i | 



ii.fei 



2 ^pH©T,,*]iiE{tr(5R ilifcl )pC Ilifcl H H 5] r i}. (73) 



Z,,fc 



After simple algebraic operations and summing over i,j and /, we then get 
E{tr (sflQT^H*)} = -w E E{^ 1)fcl tr (OTeTj^^H^ 



ii,fci 

— a; 



E ^Eltr^T^.Jtr^HGT^GT^^H^)}. (74) 

Therefore, we have 



h,ki 



E {i tr (^S,*^) } = E «^ E {^ tr («SH0T, )fe 0T Jl)fel H^) } - ^ E E { V*i Pllik^ 

h,ki d,fci 

(75) 

where p® lfcl is g iven b Y ( 71 )- Usin g the fact that E {^i,fci Pil]hki } = E |^i,fciftfckfci }> we obtain (70). □ 
Applying this lemma to (68), we get 

E{S} = -Iat-E -tr(T i , fc 0)E{ < SR ijfc } + £ E{^ fc 5HT ijfc 9H H } 

w i,* nk i,k 

+ E E{^tr(5H0T, )fc H ff ) } E {5R,,J - - EE E {^^C^} E {^} 

l,k ^ k > l,h k,ki J 

L K , . 

-^EE a ^ E -^(^H0T ijfc 0T Zlifcl H ff ) E{5R ijfc } 

l,h fc.fci ^ fc ^ 

+ Y, e{p$ ^} + E E {^S " E {^H0H^}. (76) 

Lk ^ i,fc ^ ^ 



Define 

L if 



A — E E {^ ^HT, fc 0H^} - - EE E {^a- 1 pS 1 fc 1 } E («5R,fc} 

/,fe l,h Mi 

+ x; e{pS <sR,fcj + e e {^ ^4- ( 77 ) 
Noting that



L K , . 

^EE a ^ E -tr(«SH0T i5 ,0T, i>fci H^) E{«SR^} 

= E E { ^t r («5HG)T iifc H^) }E{5R iifc } - u, £ E{^-tr(5H0T i| ,0H i? ) }e{5R^}, (78) 
we therefore get 

E{S} = -I N - -tr(T^0)E{5R^J + W ^E(-tr(5H0T Jifc 0H^) |e{«SR^} 



- e|«SH0H h | + A. (79) 

Writing 

rz, fc ^ -tr(T, fe 0) - a;— tr(E{S}H0T, fc 0H H ) = -trfc.0 (l n - W H^E{S}H0) Y (80) 



n fc n fc v / n fc 

we have 



E{S} (ijy + HeH F ) = -I* - £ r fc E{5}R^ + A, (81) 



and then 

~(&l,k - ^,fc)E{5}Rj i. + A. (82) 



E{<S} ( Ijv + ^ fi^R^ + H6H ff J = -I N + 

\ l,k J l,k 



As a result, we then get 

E{S} = S + W ^(a i>fc -f iii )E{5}R i ^3 + W AS, (83) 

where 

IiV + ^a ; , fe R^ + H0H H 1 (84) 



l.k 



and 



d/, fc ^^-tr(T^E{5}) =^-tr(T, 



To get Proposition 3, it remains to show that am — 77 j. — > and tr(AH) — > 0. To that end, we have to get 
a similar expression of E{«S} as that of (83). Following the same derivation of (83) from the beginning, we 
can get 

E{S} = 3 + lj Y,( a i,k ~ r ltk )E{S}T l)k S + co AS, (86) 
i,k 
where

( o } ^ ^ ( o o (3) ^1 

A 4 £ E °%k «5H^R, jfc 0H - ^ E E E tiuiJwxK E {^k} 
i,k ' fc.fei ^ 

+ E E ST,,*} + E E [hi ST t A , (87a) 
r/, fc ^ J-tr^e (ijv - W HE{5}H H 0)), (87b) 

- — tr fl!z,fc^) - ( 87c ) 
$ 4 i_tr (SH^R^fi) , 4 Itr (sH^R^h) , (87d) 

*£U = ^ (^©R^OR^h) , (87e) 

and those 3 and are given by (50b) and (50c) respectively. 
From (50h), (80), (83) and (86), write 

ai, k = ^tr(T^H) + J(oy - n^tr^E^T^s) + ^tr^As) (88) 

Tlf~ \ / ylfe . . \ / ylfe \ / 

hi 

and 

*U =— tr (T^© (l n - W H H Efie)) - — J>m " ^)tr (t^QH^E {5} R^-SH©) 

fife V \ / / fife . . V / 

2 

- — tr (T (ifc GH H A3HG) 
=— tr(T,, fc s) - — E(^,i " T i , i )tr(T iifc 0H^E{ < S}R M .HH©) - — tr^SH* ASH©) 

Tlfe \ / fife . . \ / Tife \ / 

=^ ~ Z~ E( a M " ^(T^iSjT^s) - ^tr(T Zifc A3) 

Tlfe . . V / Tlfe V / 

2 2 

vEfe-^^fe^^ 5 }^ 3 ™) - ^tr(T ijfc 0H"AHH©). (89) 

Ttfe \ / ?7,fe \ / 

Similarly, 

ai,k = — tr (Rj fc S) + — E(«iJ " ^^(R^E^jR -H) + — tr (R^AH) , (90) 
and



Ti, k = ai,k ~ ^j) tr (Sl,* E { 5 }SijS) tr (R/ ifc AE) 

T^k . ■ 



i -J 



^ " T ij )tr(R Zifc 0HE{«S}X t , J 3H^0) - ^trfR^QHAHH^e) . (91) 



2 

-1 



Let 77= [vec(Ai) T ,vec(A 2 ) T ] T , e= [vec(Ci) T , vec(C 2 ) 
C^^rii,^,^!^^ G C^X^, With 



Til Ti2 

r 22 



, where Ai,A 2 ,Ci,C2 G 



[Ai]i, fc 


= &l,k ~ 


[Ci]i,fc 


UJ 


= — tr 


nk 


[p2]/,fc 


UJ 


= — tr 


Uk 







ll\lk,ij 



I2\lk,ij 



22\lk,ij 



ii k 



[A-2]l,k = «/,fe - T l,k, 

uj 2 

— 1 

nk 

uj 2 
— t 

nk 



tr T U 0H^E{5}R M 3H0 , ± (l,k); 



in 



1 - ^tr(T^0H^E{5}R iifc HH0), = (l,k), 



UJ 



n k 



— tr T, fc E{5}T M H , [T 21 ] 



^tr(R, jfc E{5}R^S), 



tr(B i , fc eHE{5}T ii ,SH ff ©), / 



1 - gtr(R, ife 0HE{5}T u HH^0), (i, j) = (l,k). 



From (89) and (91), we get 



Tr] = e. 



(92a) 
(92b) 

(92c) 
(92d) 
(92e) 
(92f) 



(93) 



If we can show that $\epsilon \to 0$ and $\Gamma$ is invertible, we then get our desired result $\eta \to 0$. To show that $\epsilon \to 0$,
we establish the following lemma.



Lemma 4 For any uniformly bounded matrices $Q$ and $\tilde{Q}$, we have



-tr(AQ) = o(± 



(94a) 
(94b) 
Proof: From (77), we write



i-tr(AQ) = E4^*V W £^ 

k l,k J l,h k,ki } k 

+ E E {^S + «i > )- tr (^.* Q )}' (95) 



where 



p{J = — tr^H^eH^Q). (96) 
n k \ / 



We first prove the following facts for any uniformly bounded matrices M, 



V sr (± tr( SM)) = o(±). (97) 



For this, we let T (H) = ^-tr (5M) which gives 



=-E M ^^% = --[K H SMS] nm , (98a) 



Using Lemma 2 (the Poincare-Nash inequality), we obtain 



Var (^tr(SM)) <E^E E ^^^E U[H^M5] nm ^[H^M5]^ 

Z,fc m,n m' ,n' 

+ E^E E fii'4^ 5M5H ]™^ 5M5H ]«'»'l 

l,k rn,n rn'.n' 



Lk n k 



+ E "4 E ( tr (H H 5M H 5R, )fc 5M5HT, )fc )} . (99) 



(,* n ' fc 



Noting the fact that (using < -j, Lemma 8, and Lemma 11) 

9 T /V 1 1 TV/T 1 1 2 / — ' 2 AT 

E {tr (H^MSR, fc SM"SHT, fc ) } < (100) 

we get 

/ 1 , A 4L 2 if 2 ||M|| 2 C 2 iV / N 

Var(-,r(SM))< LS^S- = (_) . (101) 

It turns out that (97) holds and thus implies that $\mathrm{E}\{\mathring{\alpha}_{l,k}^2\} = O\!\left(\frac{1}{N^2}\right)$. Similarly, based on the Poincaré-Nash
inequality, we have $\mathrm{E}\{(\mathring{\rho}^{(4)}_{l,k})^2\} = O\!\left(\frac{1}{N^2}\right)$. By the Cauchy-Schwarz inequality, the first term of
the right-hand side of (95) is an $O\!\left(\frac{1}{N^2}\right)$ term. Similar calculations show that the second and
third terms of the right-hand side of (95) are also $O\!\left(\frac{1}{N^2}\right)$ terms. Therefore, we obtain (94a). Similarly,
(94b) can be proved and the proof is omitted. □
From this lemma, it can be shown that $\epsilon = O\!\left(\frac{1}{N^2}\right)$. In addition, we note that



|5||,||5||,||S||,||a||,||0j||,||©fc|| < -. 



Using (13), Lemma 8 and Lemma 11, we have 



[Til] 



lk,ij 



12\lk,ij 



22\lk,ij 



> 



> 



> 



Ni LKC^ 
n k up ' 

, Ni LKOL 

i- — -77— 



n k 



2l\lk,ij 



> 



(i,j) = (l,k), 



i2 

'max 



m LKC&, 
n k ZP~ 



1 



NiLKCL^ (i,j) = (l,k). 



(102) 



(103a) 
(103b) 
(103c) 



It is possible to choose $\omega_0$ such that, for $\omega > \omega_0$, $\Gamma$ is strictly diagonally dominant. Thus the eigenvalues
of $\Gamma$ are bounded away from 0 [42, Theorem 6.1.10]. It implies that if $\omega > \omega_0$, then the $(\alpha_{l,k} - \tau_{l,k})$'s and
$(\tilde{\alpha}_{l,k} - \tilde{\tau}_{l,k})$'s are of the same order of magnitude as $O\!\left(\frac{1}{N^2}\right)$, and therefore converge to 0 when $N \to \infty$.

In the remaining part, we aim to prove that this convergence still holds for $0 < \omega < \omega_0$. Firstly,
considering $\alpha_{l,k}$ and $\tau_{l,k}$ as functions of the parameter $z = -\omega \in \mathbb{R}^{-}$, we extend their domain of validity
from $\mathbb{R}^{-}$ to $\mathbb{C} - \mathbb{R}^{+}$. Similarly to [18, Proposition 11], we have the following lemma.

Lemma 5 a/^ andri^ are analytic overC— K + and belong to S(]R + ) with \ai^\ < n ^ z 



and \ T lM ^ n k ( Kltk+ l)d(z,R+) + \d(z,R+))* ) > Where § ( 

positive measures carried by M + . 



n k {Ki, k +l)d{z,R+) 

is the class of all Stieltjes transforms of finite 



Proof: We only prove the results for $\alpha_{l,k}$ since the proof of the results for $\tau_{l,k}$ is similar. From the definition
of $S$, $S$ is invertible for every $z \in \mathbb{C} - \mathbb{R}^{+}$ and $\mathrm{E}\{S\}$ is analytic over $\mathbb{C} - \mathbb{R}^{+}$. Thus $\alpha_{l,k}$ is analytic over
$\mathbb{C} - \mathbb{R}^{+}$. Using the fact that $\|S\| \le \frac{1}{d(z, \mathbb{R}^{+})}$ and Lemma 8, we have


\ a l,k\ 



-tr(R ijfc E{«S}) 

nk 



< J_||ELS;iltrR < ^ tr -*' fc - NlPl > k 

- nk \\ X 111 -i,k- d(ZiR+) nk ( K +i) d ( z , R +y 



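where the last equality is obtained by (4). In order to state $\alpha_{l,k} \in \mathbb{S}(\mathbb{R}^{+})$, we only need to check the following
three conditions by [18, Proposition 10]: 1) $\mathrm{Im}\{\alpha_{l,k}(z)\} > 0$ if $\mathrm{Im}\{z\} > 0$; 2) $\mathrm{Im}\{z\alpha_{l,k}(z)\} > 0$ if $\mathrm{Im}\{z\} > 0$;

3) $\lim_{y \to \infty} |jy\,\alpha_{l,k}(jy)| < \infty$.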
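Let us first compute $\Im\{\alpha_{l,k}(z)\}$. For every $z \in \mathbb{C}^+$,

$$\begin{aligned}
\Im\{\alpha_{l,k}(z)\} &= \Im\left\{\frac{1}{n_k}\mathrm{tr}\left(R_{l,k}\,\mathrm{E}\left\{S\left(HH^H - z^* I_N\right)S^H\right\}\right)\right\} \\
&= \Im\left\{\frac{1}{n_k}\mathrm{tr}\left(R_{l,k}\,\mathrm{E}\left\{S HH^H S^H\right\}\right)\right\} - \Im\left\{\frac{z^*}{n_k}\mathrm{tr}\left(R_{l,k}\,\mathrm{E}\left\{S S^H\right\}\right)\right\} \\
&= -\frac{1}{n_k}\Im\{z^*\}\,\mathrm{tr}\left(R_{l,k}\,\mathrm{E}\left\{S S^H\right\}\right) > 0.
\end{aligned}$$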



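By similar arguments, we can prove that $\Im\{z\alpha_{l,k}(z)\} > 0$ if $\Im\{z\} > 0$. Next, we calculate

$$\lim_{y\to\infty}|jy\,\alpha_{l,k}(jy)| = \lim_{y\to\infty}\left|\frac{1}{n_k}\mathrm{tr}\left(R_{l,k}\,\mathrm{E}\left\{\left(\frac{1}{jy}HH^H - I_N\right)^{-1}\right\}\right)\right| = \frac{1}{n_k}\mathrm{tr}\,R_{l,k} < \infty.$$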

nk 

Since the three sufficient conditions have been verified, we have $\alpha_{l,k} \in \mathcal{S}(\mathbb{R}^+)$. □
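The three conditions checked in this proof can be made concrete on a toy Stieltjes transform. The following is a minimal sketch, assuming a discrete positive measure with hypothetical atoms and weights on $\mathbb{R}^+$ (these values are illustrative and do not come from the paper); it uses the same sign convention as the resolvent $(HH^H - zI_N)^{-1}$, i.e. $f(z) = \sum_i w_i/(\lambda_i - z)$.

```python
def stieltjes(z, atoms, weights):
    """Stieltjes transform sum_i w_i / (lambda_i - z) of a discrete positive measure."""
    return sum(w / (lam - z) for lam, w in zip(atoms, weights))

atoms = [0.5, 2.0, 7.0]     # hypothetical support points in R^+
weights = [1.0, 2.0, 3.0]   # hypothetical positive masses
z = complex(-1.0, 0.3)      # an arbitrary point with Im z > 0

# Condition 1: Im f(z) > 0 when Im z > 0.
cond1 = stieltjes(z, atoms, weights).imag > 0
# Condition 2: Im (z f(z)) > 0 when Im z > 0.
cond2 = (z * stieltjes(z, atoms, weights)).imag > 0
# Condition 3: |jy f(jy)| stays bounded; it approaches the total mass of the measure.
y = 1e9
limit = abs(1j * y * stieltjes(1j * y, atoms, weights))
print(cond1, cond2, limit)
```

For $\alpha_{l,k}$ the role of the total mass is played by $\frac{1}{n_k}\mathrm{tr}\,R_{l,k}$, which is the value of the limit computed in the proof.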
Using this lemma, \a^ k - n jk \ < nfc(Kj ^)^ z ^+) ( 2 + ^yp ) ■ Moreover, {a; jfc - r^jv/.fc is a family 
of analytic functions. By Montel's theorem [43], this convergence still holds for $0 < \omega < \omega_0$, and (49a) and (49b) hold true.

A.2 Proof of Proposition 4

Using the resolvent identity (Lemma 12) $\Xi - \Psi = \Xi\left(\Psi^{-1} - \Xi^{-1}\right)\Psi$, we have



S - * = wEdiag I I ^2 (%k ~ Sii t k)'Ri,k 

V U=i 

Similarly, 



^ * + a; 2 HHdiag^|^K fc -A ifc e iife )* fc T iifc fc | J H H *. 

(104) 



5 - * = wHdiag (fj2 ikk%h ~ ai >k )^i,k\ ) * + w 2 HH^diag ((j^ (5;,* ~ %k)^l s k®l } ) H*. 

V U=i J vfc/ V U=i J v;/ 



Taking the trace, we get 



tr (S - *) =oo Y^(%k ~ a«,fc)tr (SR,, fc *) + w 2 J^Kfc - A, fc e* )fe )tr (sH$T; fc 0H fl * 



(105) 



(106a) 



tr 



Itr (ST^*) + oo 2 ~ e,, fc )tr (sH^R^GH*) . (106b) 

z,fc 
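From Proposition 3, we have

$$\alpha_{l,k} = \frac{1}{n_k}\mathrm{tr}\left(R_{l,k}\langle\Xi\rangle_l\right) + \varepsilon_{l,k}, \quad (107a)$$

$$\tilde{\alpha}_{l,k} = \frac{1}{n_k}\mathrm{tr}\left(T_{l,k}\langle\tilde{\Xi}\rangle_k\right) + \tilde{\varepsilon}_{l,k}, \quad (107b)$$

where $\varepsilon_{l,k}$ and $\tilde{\varepsilon}_{l,k}$ converge towards 0. Therefore,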

ai,k-Pi,k%k = — tr (Rz,fc((S - *)),) + e Zjfe 
n k 

2 

=— ^2(e i:j - &ij)tr (R^HR^*) + — EKi - A^e^tr (r^SH^T^-SH^*) + e t>k , (108a) 

ai, k -ei,k = — tr (t^B - *} fe ) + e, fc 

n k \ J 

2 

= — E(&^' " a ^ tr fefcSTy*) + — " g ^) tr (T i , fc 3H ff $R ij eH*) + e I|fc . (108b) 

ILL- > ' I LV> \ f 

Using the same approach as in the proof of Proposition 3, we prove that the $(\alpha_{l,k} - \beta_{l,k}e_{l,k})$'s and $(\tilde{\alpha}_{l,k} - \tilde{e}_{l,k})$'s converge towards 0. From (106a) and (106b), we complete the proof of Proposition 4.

A.3 Proof of Proposition 5

We first establish (52a) and (52b). The equations (89) and (91) can be rewritten as 
&l,k ~ n,k = 7T E Ki " n,j) tr (T, ifc E{5}T iJ s) + ^tr (t^AE 

2 2 

+ — E ( fi v " f M') tr (T^G)H f/ E{ < S}R lJ HH0) + — tr (t^QH^ASH©) , (109a) 

«/,fc - ?u = — E ("m - f v) tr (Bi,* E W Ri,i H ) + r- tr (R;,fc AH ) 
Ti/j . . n k 

2 2 

+ — E ~ T ^ tr (m,k®K£{<S}T hJ SH H ©) + — tr (R^QHASH^©) . (109b) 

fife . . \ / Tl k \ / 

We can write these two equations in matrix form: 

r 1 = T'r ] + e', (110) 
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where e' = \yec(C 1 ) T , vec(C 2 ) 
and 



r' r' 

1 11 1 12 



r' r' 

1 21 1 22 



, with d, C 2 e c ix *, r' ll5 r' 12) r' 21) r' 22 e c LK * 



LK 



l C 'l]l,k 
[C'2\l,k 



£tr (t^As) + gtr fc >fe 0H^ASH0 



l-gtr(T, ifc 0H^E{5}R iifc HH0 



tr (R (i ,A3) + gtr (R u 0HASH"0 



l-gtr(R^0HE{5}T w SH^0 



0, 



^tr(T^0H^E{5}R i7 SH0 



l-£tr T ijfc 0H^E{«S}R Zifc 3H0 



for (i, j) / (l,k); 
for = (Z,fc), 



[r' 12 ] 



lk,ij 



tr T, fc E{5}T M .H 



1-gtr T, jfc efl*E {5} Resile 



[r' 



2l\lk,ij 



tr (R Zife E{5}R M .3) 



rr' 



22\lk,ij 



1 - gtr (R, ifc 0HE{«S}T ijfc HH^0 



^1 



tr R U ,0HE{5}T M .HH"0 



1 - gtr (R iifc 0HE{«S}T ijfc HH^0 



for (i,j)^(l,k); 



, for = (l,k). 



(111a) 
(111b) 

(111c) 

(nid) 

(llle) 
(lllf) 



Let r" be the matrix by replacing E{«S}, E{«S}, 3, S, and in r' with \I>, <& and 4>, re- 

spectively. Using Propositions 3 and 4, we immediately obtain 



r' = r" + s, 



where all entries of d converge to as W — > oo, and T" is given by 



(112) 



1 11 1 12 
1 21 1 22 



(113) 
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v/ -p// r » -p// ^ mLKxLK &nd 



with r^r^r" r 2 ec 



lk,ij 



0, for 



(2) 
U lk,ij 



l-U 



-, for = (7,fc), 



lk,ij 



7 « 



lk,ij 



(1) 
lk,ij 



l-V 



(2) 

/ /"i . / A 



(2) 



-, m. 



l-u 



(2) ' 
Iklk 



22\lk,ij 



0, £ar(i > i)^(Z > fc); 



7 ( 2 ) 
lk,ij 



l-V 



(2) 



-, for (i, j) = (l,k), 



lk,lk 
2 



''fc lb k y ' 



Lemma 6 Let $\Gamma''$ be the matrix defined by (113). Then, we have



S up[p(r'0]<i--^^<i, 

TV (W + A'q) 



sup 

N 



(i - r'y 1 



< 



A w 2 



for some constants $A_0$ and $A_0'$.



Proof: From (14), a direct calculation yields 



Pl,k%k =— tr (R^** 1 *) 



"A- 



1 



— tr Rj fe * wIat + wJ^eij-Rjj +wH#H ff * 



n k 



i.J 



(ii) W 



- £ g v tr (S*,**ay *) + -tr (R ijfc **) 



8.3 

,2 



— £ faeijtr (Rj ^H^T, /H^*) + — tr (r ; fc *H**H^* 



where (i) and (n) are obtained by expanding \l/ 1 and $ , respectively. Similarly, we can get 



"A- 



n k 



+ -E S ^' tr fc #H^$R tJ $H* + — tr T ; fc *H^**H* 
The equations (116) and (117) can be rewritten as



nfc 
n 



and 



rik 



n 



-eik 



Pl,kei,k =yZ—ei,j—tr (R 4 *R, fe *) + -tr (R^** 

' n rij ' J ' n ' 

hi 

2 

+ — Ajeij— tr (t^H^R, fc *H<f>) + ^-tr (R i fe *H**H H * 

n 
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T^T^*) + "tr(T, ifc ¥¥ 



+ ^ ^e i} ~tr (R ij *H*T 4i .*H H $) + ^-tr [T i fc *H^**H* 



E ^>^£k + E ^MS* + ^ tr + V tr (^* S *** fi *) . (119) 



(118) 



Now, let £ = [vec(A 3 ) T ,vec(A 4 ) T ] T , b 4 [vec(C 3 ) T , vec(C 4 ) T ] T , T" 

c 3 ,c 4 e c Lx ^,r'/i,r^,r' 2 'i,r^ e c L ^ xL ^ with 



r /,, r /// 



11 12 
21 1 22 



, where A3, A4, 



[C3]; ; fc 
[C4]i,fc 



r a 1 nk ~ 
— Pl,k e l,ki A 4 \l,k — — e l,k, 
n n 



1-u 



(2) 



2 



W" 



ll\lk,ij 



0, 
t (2) 



\r"' 



u 



l-u 



(2) 



= (l,k), 



121lk,ij 



(1) 
ij,lk 



1 (2) : 

1 - u m 



2lJ(fc,ij 



1 (2) : 



lk,ij 



0, (i,j)^(l,k); 



(2) 



1 (2) 
1 — IK.'.. 



= (l,k). 



Thus, from (118) and (119), we have 



£ = r'"£ + b. 



(120a) 
(120b) 

(120c) 
(120d) 



(120e) 



Using the matrix inversion lemma (Lemma 13), we obtain 3>H^\I/ = \I/H^<1>. This implies that 



u 



(121) 

(2) 
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(2) 

v^ y, for Vi,j. We immediately get 



(r") T . (122) 



Now, define 

,(2) 1 „( 2 ) 1 _ o,( 2 ) 1 .,(2) 



A = diag (l - u^ u , . . . , 1 - u^ LK , 1 - v$ tll , . . . , 1 - v^ )LK ) . (123) 
Multiplying both sides of (121) by A gives 

A£ = AT"'£ + Ab. (124) 

For $\omega \in \mathbb{R}^+$, the entries of $\xi$, $A\Gamma'''$ and $Ab$ are positive. Thus, the entries of $A\xi$ are positive. Since the entries of $\xi$ are positive, we conclude that $1 - u^{(2)}_{lk,lk} > 0$ and $1 - v^{(2)}_{lk,lk} > 0$, for $\forall l, k$. From (121), we obtain that the entries of $\Gamma'''$ and $b$ are positive, for $\omega \in \mathbb{R}^+$. Lemma 15 implies $\rho(\Gamma''') \leq 1 - \frac{\min_i b_i}{\max_i \xi_i}$. Using Lemma 8, (13), and the fact that $\|\Psi\|, \|\tilde{\Psi}\| \leq \frac{1}{\omega}$, we have

T^k n N\ C max /^O Cmax / 1nP i 

— Pi,kei,k < < (125) 

n noo oo 



and 



^fc ~ ^ ra fcCmax , C max /1n „. 

— e i)fc < < , (126) 

n noo oo 



where $\beta_0 = \max_{l,k}\{\beta_{l,k}(N)\}$. From (10), we have

$$\sup_N \max_i \xi_i \leq \frac{C_{\max}}{\omega}\sup_N \max\{1, \beta_0\} < +\infty. \quad (127)$$



For hi, we have 



w , ,Ww tr BjLt («)w tr R (fc 3 N w tr R (fc 
6, fe >-tr R, > - V V / ; > V y ^ k " 9 > =^ 



>^ (128) 

n ( w + max{l, ftfiLKC*^ + LKC max ) 2 

where (i) and (ii) follow from 1) − a) of Lemma 8, i.e., $(\mathrm{tr}(AB))^2 \leq \mathrm{tr}(AA^H)\,\mathrm{tr}(BB^H)$, and (iii) is due to 2) of Lemma 8. Similarly,

? . (129) 

' ^(w + maxlL/JolLKC^ + LETC^) 2 

As a consequence, we have 

infminb; > (130) 

(oo + su Pjv max{l, /3 }LKC max + LKC max ) 2 



where C 5 = infjy max{±tr (T I(fc ) , ±tr (T\ fe )} > 0. 






Combining (127) and (130), we obtain 

su P [p(r'")]<i--^^<i. (i3i) 

According to (122), (115a) holds true. It is easy to get (115b) from $\rho(\Gamma'') < 1$. A similar proof can be found in [18,44], and is therefore omitted. □

Applying this lemma and (112), there exists Nq such that (I — T') is invertible, for each N > Nq, and 
sup N>No [||| (I - r')" 1 IJ < ■ Note that e' = O (^) 1. Hence, from (110), we obtain (aj,fc-7j )fc )'s 

and — Ti 7 kYs are of O (j^z)- This establishes (52a) and (52b). 

From (52a), (52b), (107a), and (107b), we have $\varepsilon_{l,k} = O\left(\frac{1}{N^2}\right)$ and $\tilde{\varepsilon}_{l,k} = O\left(\frac{1}{N^2}\right)$. (108a) and (108b) can be rewritten in a matrix form similar to (110). Using the same approach as in the proof of (52a) and (52b), we prove that the $(\alpha_{l,k} - \beta_{l,k}e_{l,k})$'s and $(\tilde{\alpha}_{l,k} - \tilde{e}_{l,k})$'s are of $O\left(\frac{1}{N^2}\right)$. This shows that (52c) and (52d) are established and the proof is completed.

B Proof of E{rriB N } — Ejm^} = O {^^j m Theorem 1 

The aim of this appendix is to prove 

|E{m Bj »} - E{m Bi »}| = O (-±=\ . (132) 



We mainly make use of the generalized Lindeberg principle given below. 

Lemma 7 (Generalized Lindeberg Principle [30]) Let v = [vi] £ M n and v = [ft] 6 M n be two random 
vectors with mutually independent components. Define {oj} 1<i<n and {bi} l<i<n with 

ai^|E{^}-E{ft}|, and h ± \E{vf } - E{ft 2 }|. (133) 

Then, given a twice continuously differentiable function $f : \mathbb{R}^n \to \mathbb{R}$, we have



|E{/(v)}-E{/(v)}|<£ 



i=l 



i E{\dj(v\-\0,v? +1 )\} + h l E{\d?f(v\-\0,v? +1 )\} 
+ |e|^ \dff {v[-\s,v? +1 ) | ( Vl - sfds} 

±E | £\df f (vi" 1 , 8, v? +1 ) I (ft - s) 2 rfs 



2 



(134) 

where df is the p-fold derivative in the i-th coordinate, v^ 1 = (v\, . . . and v™ +1 = (ft+i, • • • ,v n ). 

As X;,fc' s an d Xi j,'s are matrices with entries satisfying (12), we have af'^ = bf'^ = for i = 1, . . . , N 
and j = 1, ... ,n. Therefore, the remaining challenge is to evaluate the third and fourth terms of the 
right-hand side of inequality (134). Since the real and imaginary parts of $x_{ij}^{(l,k)}$ are independent, all the results established in the real case can be directly applied to the complex case. Thus, without loss of generality, we only take the derivative with respect to the real part of $x_{ij}^{(l,k)}$ in (134). Before proceeding, we remark that because of the finite 6-th order moment assumption on the $X_{ij}^{(l,k)}$'s, the following proof is much simpler than that in [19].
Let 

where 



/ {Aj |fc } 



\/Lk 



1 tr (G + loIn)- 1 



N 



(135) 



(136) 



and 



G = (S& ^Mk + Si,*) j (e (b& A Iifc Tj fc + &,*) j , 

for any A/^ 6 M Arx ™, for Z = 1, . . . , L and k = 1, . . . , K. As such, we have m& H {u)) = / ^{X^}^ fe 

mB N (u) = / {[ x l,k}\/i >k \ To use ( 134 )> { A «,fc} V i ifc wil1 take the form { A i,k = [A^ k) (l 0l kQ,r, c, s )]} Wfe 
with 



A-j°( z o,fco,r,c,«) = { 



if Z < Z , or Z = Z , k < k , or Z = Z , k = k ,i < r, 
ov I = Iq, k = kQ,i = r, j < c; 
s, if (Z,Zc) = (Z ,A;o) and (i,j) = (r,c); 



(137) 



,(i,k) 



otherwise. 



Taking the third-fold partial derivative of (135) with respect to A\'- , denoted by d\- , we have 



ij J 



-|tr fe) G) (G + wl^)- 1 (flg'*>G) (G + c^)" 1 (<f fc) G) (G + W I 

3 

iV 



iv 



f Itr ((5f )2 G) (G + u^)" 1 (if fc) G) (G + u I N y 



+ Itr ((3# fc) G) (G + (af - )2 G) (G + ^r 2 ) , (138) 



where 



+ IE (s|,fci A 'i.fci^h,*i + Sji.*i) I (^ E ^) , 



5 



( '' fc)2 G =2Tg fe) R^ fe E ii Rj fe . 



(139a) 
(139b) 



Here, $E_{ij}$ denotes the matrix whose entries are all 0 except for the $(i,j)$-th entry, which is 1.
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Using Lemma 8, the first term of df? f can be bounded by 



tr m k) G) (G + uI N y l {&t- k) G) (G + coIn)' 1 {d\ f J G) (G + ul N ) 



< 4 II (^G) in, 



(140) 



and the second and third terms of df? k f can be bounded by 



tr ( (dff )l G) (G + ujIn)- 1 (cf fcj G) (G + ul 



N) 



< ^ll(^' fc)2 G)|| F ||(a;f } g)|| f . 



(141) 



From (139a) and (139b), using Lemma 8 and (13), we obtain 



||(df } G)|| F <2 £ ( llBjfcEo-TjfcTj^A^H*^ || F + IIR^E^H^ || F 



(0 



i i 



(ii) 



- 2 E ( CmaxllEyT^T^^A^fcJIp + IIR^E^T^H^^JIf 



-2C max ^ 
h 



tr I E, 3 -T 1 ^T 1 » ill A» h A, 1 , fcl T,' iiti T 1 ^E i 



h,ki 



A' 



Z,fe \j=l 



■± 

2 

max 



(142) 



and 



||(5^G)|| F = 2t-rmi < 2||Ti, fe ||||R^|| < 2C n 



-03 — w 



(143) 



where (i) is obtained by the triangle inequality of the Frobenius norm, (ii) follows from 1(b) of Lemma 8 and (13), and (iii) follows from 2 of Lemma 8 and (13). Combining everything together, we get

3- 

e {dt k0)3 f} <^\Iy,(y, (A!k k) Y) 2 +c 2 " 



l,k \i=l 

WCi(LA' + l) 2 



< 



(") C 3 
< — 
~ N 



N 



( 



l,k \i=l 

N 



+ G 



|s| + E < 



£ (4°' fc0) 



L K / N 
i^Zo fc^fco \i=l 



( =§(M 3 + c 4 ), 



(144) 
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where C%, C2, C3, C4 denote constants, (i) is obtained by Lemma 9, (w) follows from the definition of •' 
(137), and (ra) is due to the fact that X^' k ' and xfj have finite 6-th order moment, thus giving the 
second and third terms of third line of (144) as 0(1). 
Finally, using (134) and (144), we obtain 




The quantity |E {mB N (u)}} — E{Q {mjs N (^)}}\ also admits the same upper bound. Thus, (132) is true. 

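C Existence and Uniqueness
C.1 Existence

Following [16] and using Proposition 3, the existence of $(e_{l,k}, \tilde{e}_{l,k})_{\forall l,k}$ can be shown.
C.2 Uniqueness

Let $(e_{l,k}, \tilde{e}_{l,k})$ and $(e^{\circ}_{l,k}, \tilde{e}^{\circ}_{l,k})$ be two solutions satisfying (14), and let $\Psi^{\circ}$, $\tilde{\Psi}^{\circ}$, $\Phi^{\circ}$, $\tilde{\Phi}^{\circ}$ be the matrices obtained by replacing the $e_{l,k}(\omega)$'s and $\tilde{e}_{l,k}(\omega)$'s in $\Psi$, $\tilde{\Psi}$, $\Phi$, $\tilde{\Phi}$ with the $e^{\circ}_{l,k}(\omega)$'s and $\tilde{e}^{\circ}_{l,k}(\omega)$'s, respectively. To prove the uniqueness, we need to show that $e_{l,k} - e^{\circ}_{l,k} = 0$ and $\tilde{e}_{l,k} - \tilde{e}^{\circ}_{l,k} = 0$, for any $l$ and $k$. Our proof is inspired by [18].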
A standard calculation involving Lemma 11 yields



Pl,k( e l,k ~ el k ) 



e i,k ~ e, k 



. ■E( e M'-^)tr(R / , fc *R M *°) 

hJ 



-1,3 



-eh)tr{T^T itJ V 



+ — " e M') tr T^^H^R^ !!*' 



Now, let £ = [vec(As) T , vec(Ag 

C LKxLK with 



nn n 12 

II21 II22 



where A 5 ,A 6 G c Lxi ^, nu, ni 2 , n 21 , n 22 G 



(146a) 



(146b) 



[A 5 h,fc = A,fc(e«,fc - e,° fc ), 



\n 



ll\lk,ij 



0, for (U)^(Z,fc); 



^tr ^.fc^H*^* H^*° 
1 " gtr (r u ,*H*T^*°H^* ( 



for = (/, 



fn 



12jjjfe,ij 



.£tr(R 4l **Hy*< 



fn 



22jZfc,y 



1 " gtr (R^*H*T, )fc #°H**< 



■^trfS,,**^* 



1 " gtr (T^*H^*R ; >t $°H* 



^tr f T, ,.*H H *R,- ,*°H* 



l-gtr(T Jfe *H^*R u *°H* 



for (i, j) / {l,k); 



Thus, (146a) and (146b) can be written together as 



C = nc 



(147a) 
(147b) 

(147c) 
(147d) 

(147e) 



(148) 



To complete the proof, it remains to prove that $\rho(\Pi) < 1$. To do so, we first write (116) and (117) in matrix form as follows:

$$\xi' = K\xi' + b', \quad (149)$$

Ku Kia ' 
K 2 i K 22 



where £' = [vec(A 7 ) T , vec(A 8 ) T ] T , b' = [vec(C 7 ) r , vec(C 8 ) r ] T , K 



, and A 7 , A 8 ,C 7 ,C 8 G 
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C LxX ,K ll5 K 12 ,K 21 ,K 22 eC 



LKxLK 



with 



tr (Rz fc**) + *£tr (Rj fc *H**H^*) 



[K 



[ C l]l,k 
[Cs]l,k 

ll]lk,ij 



2l\lk,ij 



Hk,ij 

'(i) 
lk,ij 



l-U 



1(2) 



1 - v 



f(2) 



/(2) 



l-U 



/(2) : 

/ A.' , / /i. 



for / (Z,fc); 
for (i, j) = (l,k), 



\K 



u 



12\lk,ij 



'(1) 
lk,ij 



l-U 



,(2) ■ 
Ik, Ik 



l-V 



l(2) ■ 
lk,lk 



\K 



22\lk,ij 



0, for (i,j)^(l,k); 



1(2) 
lk,ij 



1 - V 



■hf, for = (l,k), 



£-tr (R iifc *R 4J *) , u^i. = =-tr (Rj, fc * 



/(2) 

tJ *~ / i d lk,ij 



'(2) 
2 



2 



v'W. = — tr ( T, fc *H H *R,- ,*H* 



— h3 



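Using an approach similar to that of (121), we get that $1 - u'^{(2)}_{lk,lk} > 0$, $1 - v'^{(2)}_{lk,lk} > 0$, $\forall l, k$, and the entries of $K$ and $b'$ are positive, for $\omega \in \mathbb{R}^+$. Therefore, from (149) and Lemma 16, we have $\rho(K) < 1$. Similarly, we also have $\rho(K^{\circ}) < 1$, where $K^{\circ}$ as well as $K^{\circ}_{11}$, $K^{\circ}_{12}$, $K^{\circ}_{21}$, and $K^{\circ}_{22}$ are the matrices obtained by replacing $\Psi$, $\tilde{\Psi}$, $\Phi$, and $\tilde{\Phi}$ with $\Psi^{\circ}$, $\tilde{\Psi}^{\circ}$, $\Phi^{\circ}$, and $\tilde{\Phi}^{\circ}$, respectively.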

— i i 



,'(2) 



(150a) 
(150b) 

(150c) 
(150d) 



(150e) 

(150f) 
(150g) 



iT^*°H H * R^ satisfying tr(AA H ) 



4% < 1 an d tr(BB") = v'Sl < 1, we have 



1 - — tr f R, fc *H*T, i°H H *° 

n k V ' 



- — tr (Rj )fc ¥H*T, fc *H ff * 

Ttk ^ 



1 - — tr f Rj fc *°H*°T, t $°H fl *° 



a,fe ■ 



• (151) 



Applying the Cauchy-Schwarz inequality to the numerator of $[\Pi_{11}]_{lk,ij}$ and from (151), we obtain



|pncii]zjfe,ijl < 



nk 



^tr ( R, fc *H*T.- „-*H a * 



%3 



1 - ^tr R t *H*T, t *H^* 



^tr ( R k *°H*°T ^H^* 
1 " gtr (R U *°H*°T U ,*°H^* C 



/(2) 

Hk,ij 



l-u 



1(2) 
Ik, Ik 



U 



/o(2) 
lk,ij 



l-U 



lo(2) 



lk,ij 



(152) 
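Likewise, we have

$$|[\Pi_{12}]_{lk,ij}| \leq |[K_{12}]_{lk,ij}|^{\frac{1}{2}}\,|[K^{\circ}_{12}]_{lk,ij}|^{\frac{1}{2}}, \quad (153a)$$

$$|[\Pi_{21}]_{lk,ij}| \leq |[K_{21}]_{lk,ij}|^{\frac{1}{2}}\,|[K^{\circ}_{21}]_{lk,ij}|^{\frac{1}{2}}, \quad (153b)$$

$$|[\Pi_{22}]_{lk,ij}| \leq |[K_{22}]_{lk,ij}|^{\frac{1}{2}}\,|[K^{\circ}_{22}]_{lk,ij}|^{\frac{1}{2}}. \quad (153c)$$

Using Lemma 18 and Lemma 19, we obtain

$$\rho(\Pi) \leq \rho(|\Pi|) \leq \rho(K)^{\frac{1}{2}}\,\rho(K^{\circ})^{\frac{1}{2}} < 1. \quad (154)$$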



This contradicts the statement that $\Pi$ has an eigenvalue equal to 1. Therefore, we have $e_{l,k} - e^{\circ}_{l,k} = 0$ and $\tilde{e}_{l,k} - \tilde{e}^{\circ}_{l,k} = 0$, for any $l$, $k$ and $\omega \in \mathbb{R}^+$.

D Mathematical Tools 



In this appendix, we provide some mathematical tools used in the proof of the appendices. 
Lemma 8 [45]

1. Let $A = [A_{ij}]$ and $B$ be any matrices such that the product $AB$ is a square matrix. Then,

(a) $|\mathrm{tr}(AB)| \leq \|A\|_F \|B\|_F$,

(b) $\|AB\|_F \leq \|A\|_F \|B\|$,

(c) $\|AB\|_F \leq \|A\|_F \|B\|_F$,

(d) $|A_{ij}| \leq \|A\|$.

2. If $A$ is nonnegative definite, we have $|\mathrm{tr}(AB)| \leq \|B\|\,\mathrm{tr}(A)$.

3. Let $A$ and $B$ be any matrices such that the product $AB$ exists. Then, $\|AB\| \leq \|A\|\|B\|$.
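The inequalities of Lemma 8 are easy to sanity-check on small matrices. The sketch below is an illustration only, restricted to random real 2×2 matrices so that the spectral norm has a closed form; all function names are hypothetical and the check is not part of the paper's argument.

```python
import math
import random

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def transpose(A):
    return [list(row) for row in zip(*A)]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def fro(A):
    """Frobenius norm."""
    return math.sqrt(sum(x * x for row in A for x in row))

def spec2(A):
    """Spectral norm of a real 2x2 matrix: sqrt of the largest eigenvalue of A^T A."""
    M = matmul(transpose(A), A)
    a, b, d = M[0][0], M[0][1], M[1][1]
    return math.sqrt((a + d) / 2 + math.sqrt(((a - d) / 2) ** 2 + b * b))

def check_lemma8(trials=1000, seed=0, tol=1e-9):
    rng = random.Random(seed)
    for _ in range(trials):
        A = [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(2)]
        B = [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(2)]
        AB = matmul(A, B)
        P = matmul(A, transpose(A))  # nonnegative definite by construction
        ok = (
            abs(trace(AB)) <= fro(A) * fro(B) + tol                             # 1(a)
            and fro(AB) <= fro(A) * spec2(B) + tol                              # 1(b)
            and fro(AB) <= fro(A) * fro(B) + tol                                # 1(c)
            and max(abs(x) for row in A for x in row) <= spec2(A) + tol         # 1(d)
            and abs(trace(matmul(P, B))) <= spec2(B) * trace(P) + tol           # 2
            and spec2(AB) <= spec2(A) * spec2(B) + tol                          # 3
        )
        if not ok:
            return False
    return True

print(check_lemma8())
```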

Lemma 9 For any $p \geq 1$ and real numbers $a_i$'s, we have

$$\left|\sum_{i=1}^{n} a_i\right|^p \leq n^{p-1}\sum_{i=1}^{n} |a_i|^p. \quad (155)$$
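As a quick illustration of how the power-sum bound $|\sum_i a_i|^p \leq n^{p-1}\sum_i |a_i|^p$ behaves, the following sketch (with a hypothetical helper name and arbitrary test data) evaluates the gap between the two sides. The exponent $p = 3/2$ is chosen here only because step (i) of (144) cites Lemma 9; the exact exponent used there is an assumption.

```python
import random

def lemma9_gap(a, p):
    """Right-hand side minus left-hand side of the bound; nonnegative when p >= 1."""
    n = len(a)
    lhs = abs(sum(a)) ** p
    rhs = n ** (p - 1) * sum(abs(x) ** p for x in a)
    return rhs - lhs

rng = random.Random(0)
gaps = [lemma9_gap([rng.uniform(-3, 3) for _ in range(5)], 1.5) for _ in range(1000)]
print(min(gaps))

# Equality case: all a_i equal, so both sides coincide.
print(lemma9_gap([2, 2, 2], 3))
```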



Lemma 10 [42, Theorem 4.3.1] Let $A$ and $B$ be Hermitian matrices and let the eigenvalues $\lambda_i(A)$, $\lambda_i(B)$, and $\lambda_i(A + B)$ be arranged in decreasing order. For each $k = 1, 2, \ldots, n$, we have

$$\lambda_k(A) + \lambda_n(B) \leq \lambda_k(A + B) \leq \lambda_k(A) + \lambda_1(B). \quad (156)$$
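The eigenvalue bounds of Lemma 10 can be checked directly for real symmetric 2×2 matrices, whose eigenvalues have a closed form. This is only a sketch under that restriction; the triple `(a, b, d)` encoding of the matrix `[[a, b], [b, d]]` and the function names are hypothetical conveniences.

```python
import math
import random

def eig_sym2(a, b, d):
    """Eigenvalues (largest first) of the real symmetric matrix [[a, b], [b, d]]."""
    mid = (a + d) / 2
    rad = math.sqrt(((a - d) / 2) ** 2 + b * b)
    return mid + rad, mid - rad

def weyl_holds(A, B, tol=1e-9):
    """Check lambda_k(A) + lambda_n(B) <= lambda_k(A+B) <= lambda_k(A) + lambda_1(B)."""
    S = tuple(x + y for x, y in zip(A, B))
    lam_a, lam_b, lam_s = eig_sym2(*A), eig_sym2(*B), eig_sym2(*S)
    n = 2
    return all(
        lam_a[k] + lam_b[n - 1] - tol <= lam_s[k] <= lam_a[k] + lam_b[0] + tol
        for k in range(n)
    )

rng = random.Random(0)
all_ok = all(
    weyl_holds(tuple(rng.uniform(-5, 5) for _ in range(3)),
               tuple(rng.uniform(-5, 5) for _ in range(3)))
    for _ in range(1000)
)
print(all_ok)
```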



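Lemma 11 Let matrix $A_{l,k} \in \mathbb{C}^{N_l \times n_k}$ for $l = 1, \ldots, L$, $k = 1, \ldots, K$, and let $A_k = [A_{1,k}^T, \ldots, A_{L,k}^T]^T \in \mathbb{C}^{N \times n_k}$, $A = [A_1, \ldots, A_K] \in \mathbb{C}^{N \times n}$, with $N = \sum_{l=1}^{L} N_l$ and $n = \sum_{k=1}^{K} n_k$. If $\|A_{l,k}A_{l,k}^H\| \leq C$, then we have $\|AA^H\| \leq LKC$.

Proof: Notice that $AA^H$ and $A_{l,k}A_{l,k}^H$ are Hermitian matrices. Therefore, a standard computation involving Lemma 10 yields

$$\begin{aligned}
\|AA^H\| &= \lambda_1\left(AA^H\right) = \lambda_1\left(\sum_{k=1}^{K} A_kA_k^H\right) \\
&\leq \sum_{k=1}^{K} \lambda_1\left(A_kA_k^H\right) = \sum_{k=1}^{K} \lambda_1\left(A_k^HA_k\right) \\
&= \sum_{k=1}^{K} \lambda_1\left(\sum_{l=1}^{L} A_{l,k}^HA_{l,k}\right) \leq \sum_{l,k} \lambda_1\left(A_{l,k}^HA_{l,k}\right) \\
&= \sum_{l,k} \|A_{l,k}A_{l,k}^H\| \leq LKC. \quad (157)
\end{aligned}$$

□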

Lemma 12 (Resolvent Identity) For invertible matrices $A$ and $B$, we have the identity

$$A^{-1} - B^{-1} = A^{-1}(B - A)B^{-1}. \quad (158)$$

Lemma 13 (Matrix Inversion) For invertible matrices $A$, $B$ and $R$, suppose that $B = A + XRY$. Then

$$B^{-1} = A^{-1} - A^{-1}X\left(R^{-1} + YA^{-1}X\right)^{-1}YA^{-1}.$$
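Both the resolvent identity (Lemma 12) and the matrix inversion lemma (Lemma 13) can be verified exactly on small examples. The sketch below uses rational arithmetic on 2×2 matrices so that equality holds without rounding; the helper names and the specific matrices are hypothetical and chosen only so that every inverse involved exists.

```python
from fractions import Fraction as F

def mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def add(A, B):
    return [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]

def sub(A, B):
    return [[A[i][j] - B[i][j] for j in range(2)] for i in range(2)]

def inv2(A):
    """Exact inverse of a 2x2 matrix with nonzero determinant."""
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

A = [[F(2), F(1)], [F(0), F(3)]]
X = [[F(1), F(0)], [F(1), F(1)]]
R = [[F(1), F(0)], [F(0), F(2)]]
Y = [[F(1), F(1)], [F(0), F(1)]]
B = add(A, mul(mul(X, R), Y))   # B = A + X R Y

# Lemma 12: A^{-1} - B^{-1} = A^{-1} (B - A) B^{-1}
resolvent_ok = sub(inv2(A), inv2(B)) == mul(mul(inv2(A), sub(B, A)), inv2(B))

# Lemma 13: B^{-1} = A^{-1} - A^{-1} X (R^{-1} + Y A^{-1} X)^{-1} Y A^{-1}
Ai = inv2(A)
core = inv2(add(inv2(R), mul(mul(Y, Ai), X)))
woodbury_ok = inv2(B) == sub(Ai, mul(mul(mul(mul(Ai, X), core), Y), Ai))

print(resolvent_ok, woodbury_ok)
```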

Lemma 14 Assume that $A$ is a positive semi-definite $M \times M$ matrix and $B = \mathrm{diag}(B_1, \ldots, B_K)$ is a block-diagonal matrix, where $B_k$ is a positive semi-definite $M_k \times M_k$ matrix and $M = \sum_{k=1}^{K} M_k$. Let $C_k = \langle (I + AB_{\backslash k})^{-1}A \rangle_k$, where $B_{\backslash k} = \mathrm{diag}(B_1, \ldots, B_{k-1}, 0, B_{k+1}, \ldots, B_K)$. Then, we have

$$\langle (I + AB)^{-1}A \rangle_k = (I + C_kB_k)^{-1}C_k. \quad (159)$$

Proof: Letting B fc = diag(0, . . . , 0, B k , 0, . . . , 0), we have 

(I + AB) _1 A = (I + AB\ fe + ABfc)" 1 A 

( = ] C - C (I + B.C)- 1 B fc C = C (i - ((B fc C) _1 + I)" 1 ) 
^C^C^^O^+I)- 1 ) =(I + CB fe )- 1 C (160) 

where (i) follows from Lemma 13 and defining C = (I + AB^) 1 A, (ii) is due to Lemma 12. Substituting 
(160) into (159), we obtain 

((I + AB)- 1 A) fc = ((I + CB,.)- 1 C) k = (I + C.Bfc)- 1 ^, 

where the last step is obtained by calculating the inverse of (I + CB fc ) _1 . □ 
Lemma 15 [42, Corollary 8.1.29] Let $A \in \mathbb{R}^{n \times n}$, $x \in \mathbb{R}^n$, with $A \geq 0$ and $x > 0$. If $\alpha, \beta \geq 0$ are such that $\alpha x \leq Ax \leq \beta x$, then $\alpha \leq \rho(A) \leq \beta$. If $\alpha x < Ax$, then $\alpha < \rho(A)$. If $Ax < \beta x$, then $\rho(A) < \beta$.

Lemma 16 [16, Lemma 9] If the components of $C$, $x$, and $b$ are all positive, then $x = Cx + b$ implies $\rho(C) < 1$.
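The way Lemmas 15 and 16 are used in the appendices (a positive solution of $x = Cx + b$ forces $\rho(C) < 1$, with the quantitative bound $\rho(C) \leq 1 - \min_i b_i / \max_i x_i$) can be illustrated on a toy system. The matrix, vectors, and the power-iteration helper below are hypothetical; the sketch assumes an entrywise positive $C$, for which power iteration converges to the Perron root.

```python
def spectral_radius(C, iters=500):
    """Power-iteration estimate of the Perron root of an entrywise positive matrix."""
    n = len(C)
    v = [1.0 / n] * n                      # positive start vector with sum 1
    lam = 0.0
    for _ in range(iters):
        w = [sum(C[i][j] * v[j] for j in range(n)) for i in range(n)]
        lam = sum(w)                       # sum(v) == 1, so this ratio estimates rho(C)
        v = [x / lam for x in w]           # renormalise to sum 1
    return lam

# Hypothetical positive data with x = C x + b.
C = [[0.2, 0.3], [0.1, 0.4]]
b = [1.0, 1.0]
x = [2.0, 2.0]

residual = max(abs(x[i] - sum(C[i][j] * x[j] for j in range(2)) - b[i]) for i in range(2))

# Since C x = x - b <= (1 - min(b)/max(x)) x entrywise, Lemma 15 gives this bound.
bound = 1 - min(b) / max(x)
rho = spectral_radius(C)
print(residual, rho, bound)
```

In this example the eigenvalues of $C$ are 0.5 and 0.1, so the bound is attained.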



Lemma 17 [19, Lemma 16] Let $A$ and $B$ be any matrices such that $AB^H$ exists and is a square matrix.



Lemma 18 [42, Theorem 8.1.18] Let $A = [A_{ij}]$ and $B = [B_{ij}]$ be square matrices. If $|A_{ij}| \leq B_{ij}$, $\forall i, j$, then $\rho(A) \leq \rho(|A|) \leq \rho(B)$.

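Lemma 19 [45, Lemma 5.7.9] Let $A = [A_{ij}]$ and $B = [B_{ij}]$ be matrices with nonnegative elements. Then $\rho\left(\left[A_{ij}^{\frac{1}{2}}B_{ij}^{\frac{1}{2}}\right]\right) \leq \rho(A)^{\frac{1}{2}}\rho(B)^{\frac{1}{2}}$.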
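The spectral-radius bound for the entrywise geometric mean of two nonnegative matrices, which is what turns the entrywise estimates (152) and (153) into (154), can be checked numerically. The sketch below is illustrative only: it draws random entrywise positive 3×3 matrices and estimates each Perron root by power iteration; the helper names, matrix size, and entry range are all assumptions made for the example.

```python
import math
import random

def spectral_radius(C, iters=500):
    """Power-iteration estimate of the Perron root of an entrywise positive matrix."""
    n = len(C)
    v = [1.0 / n] * n
    lam = 0.0
    for _ in range(iters):
        w = [sum(C[i][j] * v[j] for j in range(n)) for i in range(n)]
        lam = sum(w)                       # sum(v) == 1, so this ratio estimates rho(C)
        v = [x / lam for x in w]
    return lam

def geometric_mean_bound_holds(trials=200, seed=1, n=3, tol=1e-9):
    rng = random.Random(seed)
    for _ in range(trials):
        A = [[rng.uniform(0.1, 1.0) for _ in range(n)] for _ in range(n)]
        B = [[rng.uniform(0.1, 1.0) for _ in range(n)] for _ in range(n)]
        # Entrywise geometric mean [sqrt(A_ij * B_ij)].
        G = [[math.sqrt(A[i][j] * B[i][j]) for j in range(n)] for i in range(n)]
        if spectral_radius(G) > math.sqrt(spectral_radius(A) * spectral_radius(B)) + tol:
            return False
    return True

print(geometric_mean_bound_holds())
```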